Cloudera Powers Opt-In Machine Learning Project for Real-Time Identification of Suicide Risk Factors in Military Veterans

Patterns and Predictions' Durkheim Project Uses Predictive Analytics Across Data Sources

PALO ALTO, CA -- (Marketwired) -- 09/25/13 -- Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™, today announced that Patterns and Predictions, a predictive analytics company, partnered with Cloudera for an ongoing initiative applying machine learning to the identification of key correlations between military veterans' communications and suicide risk. The Durkheim Project, as it is called, entails opt-in monitoring across a variety of online and mobile data channels to predict which military veterans are at the highest risk of suicide. It is powered by a real-time risk detection framework co-developed with Cloudera and built on CDH (Cloudera's Distribution Including Apache Hadoop), Cloudera Impala and Cloudera Search.

"The promise of the Durkheim Project is expressed in its ability to collect, monitor and deliver insights from a diverse repository of complex data, including mobile and social media signals, with the hope of eventually providing real-time triage of interventional actions upon detection of a critical event," said Patterns and Predictions founder Chris Poulin. "Cloudera's unique software and expertise enable us to make risk assessments faster and across larger data sets, resulting in better clinical outcomes."

Applied Machine Learning Identifies and Predicts Mental Health Risk Factors
Patterns and Predictions founder Chris Poulin began working with Dartmouth researchers in 2010 to address the problem of high suicide rates among veterans. The suicide rate among U.S. veterans is approximately twice that of the general population, a challenging phenomenon facing the U.S. Department of Veterans Affairs (VA).

With support from the Defense Advanced Research Projects Agency (DARPA), a research arm of the Department of Defense (DoD), and Dartmouth College, the suicide risk prediction project includes a database of more than 100,000 U.S. veterans, all of whom have volunteered their participation. By mining these veterans' social media posts and other indicators, Patterns and Predictions -- together with a team of experts in artificial intelligence, medical professionals from private companies, and the VA -- developed a set of predictive indicators of suicide risk for military veterans.

The tightly integrated machine learning system was trained by feeding in isolated statistical indicators -- keywords, word patterns and other linguistic clues known to be associated with people who needed help -- from a variety of veterans' data sources. Words and linguistic patterns that veterans post online are data-mined for indicators of suicidal behavior and the system identifies useful clues in real data to establish a risk "score."

With so many veteran participants, the data sets are very large. The veterans who opt into the project receive a unique Facebook app and a mobile app for either the iOS or Android operating system; these are designed to capture posts, Tweets, mobile uploads and geographic location. Additional profile data is captured as well, including physician information and clinical notes. To ensure compliance with various privacy and HIPAA regulations, all captured data is stored in a secure environment behind a medical firewall.

Open Source Hadoop Infrastructure Delivers Operational Efficiency for Critical Research
The Durkheim Project has a highly complex workflow, requiring foundational infrastructure and predictive modeling that supports big data collection and analysis at scale. Moreover, the team wanted to access all of the machine learning through search interfaces, which can get expensive since all of the machine learning is indexed.

The technical objective for building the machine learning data fabric underpinning the initiative was maximum speed at minimum cost. Poulin found most big data solutions to be either low in accuracy or highly complex to implement and to integrate with Patterns and Predictions' existing IT environment. Poulin chose to build on Apache Hadoop for its abstraction of underlying data set complexity and selected Cloudera for its category leadership and subject matter expertise in the Hadoop framework, open source and big data infrastructure. CDH, the market-leading, 100% open source distribution of Hadoop and related projects, serves as the cornerstone technology of the Durkheim Project. Using Cloudera Impala and Cloudera Search, the ingestion of data on Hadoop is markedly more efficient, delivering lower costs, better computational throughput and reduced complexity of IT support.

Patterns and Predictions engaged Cloudera Professional Services to co-develop code in the area of real-time prediction on CDH, called Bayesian Counters. The use of text analytics against the continuously fed large data pool delivers an exponential number of variables which can then be compared and analyzed, resulting in a real-time assessment of the participant's mental health. The computational processing to analyze that data requires a big data fabric, and the benefit is that the output is much more informative.
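The release does not describe how Bayesian Counters works internally. As a rough illustration of the general idea of count-based, incrementally updated text scoring, the following is a minimal sketch of a naive Bayes classifier whose counters can be updated one document at a time. The class name, method names, labels and tokens are all hypothetical and are not taken from the Durkheim Project code.

```python
import math
from collections import Counter, defaultdict


class CountingNaiveBayes:
    """Illustrative naive Bayes text scorer backed by running counters.

    This is a generic sketch, not the project's Bayesian Counters code.
    """

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing                # additive (Laplace) smoothing
        self.label_counts = Counter()             # documents seen per label
        self.token_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()                   # all tokens seen so far

    def update(self, tokens, label):
        """Fold one labeled document into the counters (incremental training)."""
        self.label_counts[label] += 1
        for token in tokens:
            self.token_counts[label][token] += 1
            self.vocabulary.add(token)

    def log_scores(self, tokens):
        """Return the unnormalized log posterior for every known label."""
        total_docs = sum(self.label_counts.values())
        vocab_size = max(len(self.vocabulary), 1)
        scores = {}
        for label, doc_count in self.label_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_tokens = sum(self.token_counts[label].values())
            denom = total_tokens + self.smoothing * vocab_size
            for token in tokens:
                count = self.token_counts[label][token]
                score += math.log((count + self.smoothing) / denom)
            scores[label] = score
        return scores

    def probability(self, tokens, label):
        """Normalize the log scores into a probability for one label."""
        scores = self.log_scores(tokens)
        peak = max(scores.values())  # subtract the max for numerical stability
        weights = {k: math.exp(v - peak) for k, v in scores.items()}
        return weights[label] / sum(weights.values())


# Hypothetical usage with placeholder tokens and labels.
model = CountingNaiveBayes()
model.update(["alpha", "beta"], "elevated")
model.update(["gamma", "delta"], "baseline")
score = model.probability(["alpha"], "elevated")
```

Because training only increments counters, new documents can be folded in as they arrive without retraining from scratch, which is one plausible reason a counter-based design suits continuously fed data.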

In the Future, Data Could Help Veterans in Crisis
In February 2013, an investigation conducted by Patterns and Predictions, Dartmouth and the VA determined that the accuracy of this risk-prediction data model was statistically significant, with "consistent accuracies" of 65 percent or higher in predicting suicide risk in a veteran control group.

Still in its initial phases, the Durkheim Project is authorized only to monitor and analyze data. While the project has delivered statistically valid results that accurately predict suicide risk in a control group of veterans, its critical research is restricted, at least for the time being, to a non-interventional protocol. Using Cloudera, the project's continued scaling of risk classifiers will help to establish the necessary confidence in its ability to assess risk in real time, as the team currently applies for an interventional study.

About Patterns and Predictions
Patterns and Predictions is a predictive analytics firm. Its core Centiment® technology provides unstructured and linguistics-driven prediction. It is the technology powering the Durkheim Project's 'big data' analytics network for the assessment of mental health risks. Partners include Bloomberg, The Geisel School of Medicine at Dartmouth, Cloudera, and Attivio. Funding sources include the U.S. Government (DARPA), and customers include Global 100 companies.

About Cloudera
Founded in 2008, Cloudera pioneered the business case for Hadoop with CDH, the world's most comprehensive, thoroughly tested and widely deployed 100% open source distribution of Apache Hadoop in both commercial and non-commercial environments. Now, the company is redefining data management with its Platform for Big Data, Cloudera Enterprise, empowering enterprises to Ask Bigger Questions™ and gain rich, actionable insights from all their data, to quickly and easily derive real business value that translates into competitive advantage. As the top contributor to the Apache open source community and leading educator of data professionals with the broadest array of Hadoop training and certification programs, Cloudera also offers comprehensive consulting services. Over 700 partners across hardware, software and services have teamed with Cloudera to help meet organizations' big data goals. With tens of thousands of nodes under management and hundreds of customers across diverse markets, Cloudera is the category leader that has set the standard for Hadoop in the enterprise. www.cloudera.com

Connect with Cloudera
Read our blog: http://www.cloudera.com/blog/
Follow us on Twitter: http://twitter.com/cloudera
Visit us on Facebook: http://www.facebook.com/cloudera

Press Contacts

North America
Hope Nicora
Bhava Communications for Cloudera
[email protected]

Richard Botley
Ketchum for Cloudera
[email protected]
+44 (0) 20 7611 3788
