This past week I read some great articles that I thought I would share.
- The Netflix Media Database (Part 1) (Part 2)
I really enjoyed that they drilled into some of the details and design decisions behind this solution. It is a good example of a polyglot system that uses multiple data stores to address different parts of the problem. I will be interested to follow this series and see how they address the challenges that remain.
- How to choose a Spark Data Abstraction (Link)
This topic came up at the most recent meeting of the San Diego Tech Immersion SIG: which of Spark’s data abstractions to choose. One point the author doesn’t mention is that if you need Python support, you cannot use the Dataset API, as it is only supported in Java and Scala (Link).
- Examination of the CDO Role
This is a combination of a few articles that all relate to the Gartner survey of CDOs (Chief Data Officers). I don’t have access to the full survey (Press Release), but a few articles here (Tech Republic), Top 3 Findings (Link), and CDO Decoded by EgonZehnder give some of the interesting numbers. A few items that I found worth noting:
- Eighty-six percent of respondents ranked “defining data and analytics strategy for the organization” as their top responsibility, up from 64% in 2016.
- The majority (80%) of CDOs said that successfully evolving their company culture was either more difficult or much more difficult than expected, the report found.
- Some 68% said that integrating data and eliminating silos were their biggest challenges. (I guess we still haven’t figured this out, and probably never will.)
I hope you find these interesting.