What will become of Big Data?

I’m often asked what I think will happen to Big Data over the next five to ten years. From a Developer’s point of view, they’re asking if investing their time in becoming a Data Engineer will pay off. We’re going to see a continuing maturity of Big Data technologies. There will be better stories on […]

Why should or shouldn’t you become a Data Engineer?

You’re considering a change to become a Data Engineer. Why should you do it? Why shouldn’t you do it? Let’s consider some reasons. Should: There is a major shortage of qualified Data Engineers. There is a high demand and low supply of qualified Data Engineers. You can make an extra $20,000 to $60,000 per year […]

Q and A: How can I tear out Informatica and MySQL and put in Big Data?

Today’s blog post comes from a question from a subscriber on my mailing list. The question comes from G.P.: I need to gain a hands on understanding of these technologies. I’m going to have to build some demonstration pilots before I would get any traction. I’m the VP of Analytics, so the engineering team think […]

Getting Stuck Crawling with Big Data

I always encourage companies to break down their Big Data projects into smaller pieces. I call this process crawl, walk, run. There is an interesting corollary to this process. Some companies get stuck at the crawl phase and don’t progress on to the walk and run phases. The first time I saw this, I was […]

Strata+Hadoop World and Trends

Last week, I gave two talks about Strata+Hadoop World. These talks covered some of the up and coming technologies in Big Data. I describe Strata as the Super Bowl of Big Data conferences. This is where you’ll find the best minds talking about the present and future conditions of Big Data. My first session was […]

Solving the First and Last Mile Problem With Kafka Part 2

In the first post in the series, I talked about Big Data’s first and last mile problems. I showed how the first mile problems could be solved with Kafka. In this post, I’m going to talk about the last mile problems. Big Data Last Mile: With Big Data we’re faced with finding value in large […]

Solving the First and Last Mile Problem With Kafka Part 1

In telecommunications, there is the term “last mile”. It refers to getting the connection to the customer. It’s the last mile between the company’s infrastructure and the customer’s location. We have similar issues in Big Data. We don’t just have a last mile problem; we have a first mile problem too. We have an issue with […]

How Programmers Should Start Viewing Training

There’s a sad thing that’s limiting our growth as programmers. We (programmers) don’t invest in ourselves like other professions. The business person will happily spend $3,000 to attend a class to improve a part of their business. The marketing person will spend $3,000 to attend a class to improve their marketing and sales copy. A […]

On complexity in big data

After years of teaching Big Data, I’ve come up with the best explanation of why it isn’t easy, cheap, or quick. I wrote the in-depth piece published on O’Reilly.

Q and A: Is a Data Engineer the same thing as a BI or DBA?

Today’s blog post comes from a question from a subscriber to my mailing list. The question comes from Alpesh D.: I have been getting your emails and they all seem to make sense. However, did I understand it correct that you believe all big data engineers need to be to use Java? I come from […]