Statistical Data Analysis
Right Start: Big Data Projects
Big Data has exploded onto the scene as a tremendous opportunity for companies across most major industries to gain a competitive advantage. But Big Data is not a small project that should be taken lightly. It represents a business imperative requiring active participation by business leaders and their teams, as well as by technical leaders and their teams.
Getting Big Data right is challenging. Gartner predicts: “Through 2015, more than 85% of Fortune 500 organizations will fail to effectively exploit Big Data for competitive advantage.”
What factors are critical to getting it right?
Having planned and implemented Big Data projects for companies including NASDAQ, Facebook, NetApp, EMC, Western Digital, Johnson & Johnson, Intel, Ancestry.com and eBay, we’ve developed a maturity model for organizational adoption (Figure 1 below). Let’s look at how businesses can best move forward at each stage of adoption, and where the missteps can happen.
Before we get started, here are three “Must Do’s” for any organization at any stage of Big Data adoption:
- Test and learn: An agile approach with rapid releases enables organizations to fine-tune their projects while they’re in progress. Legacy systems were better suited to a “waterfall” approach, where technology was introduced all at once. Big Data projects should focus on specific business goals and allow cross-pollination of ideas to better understand what’s possible, which makes a “rapid release” approach a much better fit.
- Incremental adoption: Build a center of competency and cross-pollinate expertise among business experts, data scientists, and data engineers. This enables business units to leverage a common talent pool and a shared approach, eliminating the risk of data silos, providing for common governance, and avoiding redundant storage and processing by different departments.
- Change management: Think about key stakeholders for the initiative, understand their concerns, get their buy-in, and invest in early pilot systems that demonstrate the value a Big Data investment can generate.
Moving from Business Intelligence to Scale and Cost Savings
The second stage of adoption of Big Data is about laying a foundation where data can be stored and accessed by a larger audience, allowing for better scale and favorable economics to support creating value at a greater volume, variety and velocity than previously possible. To succeed:
- Avoid the “Big Data plateau.” If you store and process data sets that are only useful for generating reports or feeding a data warehouse, it can be hard to engage the business. Look to include a combination of data sets that will allow future business value, not just IT efficiency, e.g., combining web, social, product log, text, geo, mobile or other new data sets with structured relational data about customers and transactions. Prepare for agile analytics by getting the business engaged in the planning discussions early on.
- Don’t wait. Instead of trying to sell a big project to get started, leverage existing budgets and pick low-hanging fruit that will provide immediate value.
- Get help in planning and executing. Big Data requires different skills, different data processing, different ways of storing and modeling data, and different governance skills. Training teams is not enough on its own: it takes time to learn the patterns and practices, so getting experienced help is crucial for early success.
- Hold your ground. New technologies can be frustrating to bring online, and projects often stall or are abandoned. A Big Data project needs a sustained investment of time and budget, so it is important to stay the course.
Moving from Scale and Cost to Agile Analytics and Insights
The third stage of adoption of Big Data is about empowering the business to get meaningful insights in a much more agile way, so that questions about the data are answered in hours, not weeks. To build on the success of an initial Big Data platform, it’s important to:
- Pick the right pilot: low-hanging fruit with meaningful impact, not a “castle in the sky”. Focus on projects that can generate business value in eight weeks so key constituents are able to see for themselves the value of Big Data.
- Pick up steam after success. After the launch of a first successful project, follow a portfolio approach: invest in smaller projects that build on the existing applications, which can deliver more value and pave the way for larger projects. Bring in new tools and data sets to extend systems incrementally. Instead of a “big bang” delivery, think about the smallest set of technologies and data sets that can produce results, and then keep extending based on business needs.
- Stop speculating. Big Data lets you stop wasting time and energy on speculative projects that are meant to parse, summarize, and organize data in data warehouses and data marts. Instead, you can explore and investigate new concepts with raw data, letting data scientists and analysts test out and refine approaches quickly, then build standardized support AFTER an analysis or dimension has proven its value.
- Remember: Data is not the enemy. Traditionally, companies have had a bunker mentality toward the increasing deluge of data, finding ways to condense and delete it. With Big Data, the better mindset is to embrace data as an asset and retain it for analysis. Furthermore, there’s tremendous value in seeking out new data sets from external sources (vendors, partners, government, open data sets and otherwise) to provide better insights.
Moving from Agile Analytics and Insights to Business Innovation
The fourth stage of adoption is about driving business innovation by integrating Big Data predictive analytics into business processes, to automate decision processes and even to begin gaining insights that enhance products. To make this work it’s important to:
- Emphasize your goals so data scientists and the business have a shared understanding of the problem. This lets you invest effort in activities that will move the needle, rather than working on second-order problems.
- Test and test again. Test quickly: there’s no substitute for real-world experience when testing out new ideas. With Big Data it’s possible to prepare many tests and use A/B testing principles to run parallel experiments at small scale, so you can quickly double down on the approaches that are working.
- Invest in data. Companies often get stuck with limited visibility into what is working. To avoid this, instrument your applications effectively and run control groups so you can measure improvement and get insight into how alternative strategies fare. Having a lab sandbox to simulate expected results based on the best data available is important to keep innovating.
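To make the A/B testing and control group advice concrete, here is a minimal sketch of how a small parallel experiment might be evaluated. It assumes each experiment reports a conversion count and a sample size for a control group and a variant; the function name, the inputs, and the choice of a pooled two-proportion z-test are illustrative assumptions, not a method prescribed by this article.

```python
import math


def two_proportion_z_test(conv_control, n_control, conv_variant, n_variant):
    """Compare a variant's conversion rate against a control group.

    Hypothetical helper for illustration. Returns a tuple of
    (absolute lift, z statistic, two-sided p-value).
    """
    rate_control = conv_control / n_control
    rate_variant = conv_variant / n_variant

    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_control + conv_variant) / (n_control + n_variant)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = (rate_variant - rate_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_variant - rate_control, z, p_value


# Made-up numbers: 200 of 5,000 control users convert versus 260 of 5,000
# users who saw the variant.
lift, z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"lift={lift:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, which is one reasonable signal for doubling down on a variant. In practice the significance threshold and minimum sample size should be agreed on before the experiment starts.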
Moving from Business Innovation to Organizational Transformation
The final stage of adoption is about transforming your culture so data is central to decisions and empowers new opportunities. To allow this:
- Don’t accept assertions; require analysis and evidence. To instill a data-driven culture, it’s important to reward those who make valuable discoveries based on your data, and to challenge intuition by asking: “How can we measure that?”
- Think outside the box. Big Data is an exciting contributor to reinventing the way things “used to be done”. Integrate new data, simulate and measure throughout the discovery, assessment and refinement of new products and features. By doing so, organizations are opening themselves up to a green field of opportunity.
In conclusion, Big Data represents a major opportunity for the enterprise. The journey will take time, but by making the right choices you can get value quickly and avoid pitfalls. By moving forward thoughtfully and with determination, you can gain significant advantages from this major new business innovation.
Ron Bodkin is Founder & CEO of Think Big Analytics