Dark Data
Alex Pongpech
Graduate School of Applied Statistics
NIDA
• Even bigger than big
And so The dark
Those who
lead the fight
Turning Dark
• Useful data may become dark data once it becomes irrelevant because it was
not processed fast enough. Such time-sensitive findings are called
"perishable insights" in "live flowing data".
• Examples include the geolocation of a customer and fraud detection.
• According to IBM, about 60 percent of data loses its value immediately.
• IBM estimates that roughly 90 percent of data generated by sensors and
analog-to-digital conversions never gets used.
• Not analysing data immediately and letting it go 'dark' can lead to
significant losses.
• Data must not only be processed fast enough; it must also be acted on quickly enough.
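The idea of perishable insights can be illustrated with a small freshness check. This is only a sketch: the event records, the `timestamp` field and the two-minute limit are hypothetical, chosen to show how a stream might be split into data that can still be acted on and data that has already gone dark.

```python
from datetime import datetime, timedelta


def split_by_freshness(events, now, max_age):
    """Split events into (fresh, perished) by comparing their age with max_age."""
    fresh, perished = [], []
    for event in events:
        age = now - event["timestamp"]
        (fresh if age <= max_age else perished).append(event)
    return fresh, perished


# Hypothetical stream of geolocation and card events (illustrative values only).
now = datetime(2017, 6, 1, 12, 0, 0)
events = [
    {"id": "geo-1", "timestamp": now - timedelta(seconds=20)},
    {"id": "geo-2", "timestamp": now - timedelta(minutes=30)},
    {"id": "card-7", "timestamp": now - timedelta(seconds=90)},
]

fresh, perished = split_by_freshness(events, now, max_age=timedelta(minutes=2))
print("still actionable:", [e["id"] for e in fresh])
print("already dark:", [e["id"] for e in perished])
```

The right limit depends on the use case: a fraud alert may perish within seconds, while a location-based offer may stay useful for minutes.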
Turning DARK
• Organizations retain dark data for a multitude
of reasons, and it is estimated that most
companies are only analyzing 1% of their data.
• A lot of dark data is unstructured, which means
that the information is in formats that may be
difficult to categorise, be read by the computer
and thus analysed.
• Often businesses do not analyse
their dark data because of the amount of
resources it would take and the difficulty of
having that data analysed.
• Because storage is inexpensive, storing data is
easy. However, storing and securing the data
usually entails greater expenses (or even risk)
than the potential return profit.
Why is dark data handled the way it is?
• It is surprising because at the time of data collection, companies
assume that the data is going to provide value. Companies invest a lot
in data collection, so both monetarily and otherwise, data should be
considered important. Here are a few reasons why there is so much
dark data.
Why is dark data handled the way it is?
1. Lopsided priorities: for example, data on how the customer arrived at the
application page may be ignored.
2. Disconnect among departments: data held by one department may not be
known to other departments ("this is the way we do it here").
3. Technology and tool constraints: if data collection is done by
separate technologies and tools in the same organization, it may be
difficult to integrate, for example, audio file contents from a call center
with click data from websites.
Shed some light on the DARK
Gartner defines dark data as
• the information assets organizations
• collect,
• process
• and store during regular business activities,
• but generally fail to use for other purposes (for example, analytics,
business relationships and direct monetizing).
• In an industrial context, dark data can include information gathered by
sensors and telematics.
• Similar to dark matter in physics, dark data often comprises most
organizations’ universe of information assets.
• Thus, organizations often retain dark data for compliance purposes only.
Storing and securing data typically incurs more expense (and sometimes
greater risk) than value.
Dark Data Example:
IP Location
• A manufacturer of soft drinks which runs a popular website might
think that, of all the data that they have, only those that are
directly relevant to the marketing and sales of their soft drink
products have any value for them. While they also store many
other data, such as the IP location of their users, they fail to see
how these "dark" data can also have value to their company.
• Yet if their data, properly cleansed to a high quality and then
analysed, reveal that 7% of the users of their website are
accessing their service from outside the country where they are
located, in spite of the fact that the product is only directly sold
to retailers within that country, these are in themselves valuable
data, for instance, to those who target ads at users of soft drinks.
• These dark data could also be seen as an opportunity to think
about marketing their product elsewhere. For instance, if 40% of
users from outside the country where the company was located
access their site from India, according to the IP location data,
while only 4% came from the European Union, it would strongly
suggest that a marketing campaign within Europe would have
considerably less chance of success than one aimed at the Indian
sub-continent.
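The soft drink example boils down to counting visits by country. The sketch below assumes that IP addresses have already been resolved to country codes by some geolocation step that is not shown; the visit records, the home country code "XX" and the counts are invented so that the shares match the 7%, 40% and 4% figures used above.

```python
from collections import Counter


def country_shares(visits, home_country):
    """Return (share of visits from abroad, per-country share among foreign visits)."""
    if not visits:
        return 0.0, {}
    foreign = [v["country"] for v in visits if v["country"] != home_country]
    foreign_share = len(foreign) / len(visits)
    counts = Counter(foreign)
    breakdown = {country: n / len(foreign) for country, n in counts.items()}
    return foreign_share, breakdown


# Hypothetical log: "XX" stands for the manufacturer's home country,
# and "DE" stands in for visits from the European Union.
visits = (
    [{"country": "XX"}] * 2325
    + [{"country": "IN"}] * 70
    + [{"country": "DE"}] * 7
    + [{"country": "US"}] * 98
)

foreign_share, breakdown = country_shares(visits, home_country="XX")
print(f"visits from abroad: {foreign_share:.0%}")
for country, share in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"  {country}: {share:.0%} of foreign visits")
```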
Other Dark Data Examples:
type of device
• Other typical examples of dark data, which
most websites store but fail to exploit,
include the type of device one
accesses the Internet from (typically a
smartphone, tablet or computer); the web
browser the Internet is being accessed
through (e.g. Chrome, Mozilla, Opera, Edge
or IE, among others); and even more
obscure or dark information such as the
number of times users reset their
password, which would be useful to a
company which specializes in Internet and
password security.
Other Dark Data Examples:
Customer Feedback
• A well-known example of dark data which
goes to waste is where companies have a
feedback form which allows users to give
feedback concerning their website or
service, but lack the data
structures that would allow these data
to be easily analysed. The result is a failure
to take on board and act on their users'
judgments and criticisms, whether positive
or negative (both of which have value).
How does data go DARK
Customer Feedback
Customer Information Systems
More dark data examples
• Customer Information
• Log Files
• Account Information
• Previous Employee Data
• Financial Statements
• Raw Survey Data
• Email Correspondences
• Notes or Presentations
• Old Versions of Relevant Documents
The dark bites
• Maybe we can use all this data later? This also explains why many
organizations are reluctant to part with dark data, even if they have
no plans to put it to work on their behalf, either in the near term or
further down the planning horizon.
• The dark can bite: organizations must also be aware that the dark
data they possess can pose risks to their continued business health and
well-being.
• Perhaps more chillingly, so can the dark data about them, their customers
and their operations that's stored in the cloud, outside their immediate
control and management.
Problems from the dark
• Data that is stored but not used costs money (the NYT says 90% of the energy
used by data centers is wasted).
• According to Datamation, by 2020 the cost of unused
but stored data can add up to $891 billion.
• The more data is stored but not used, the higher the risk, especially to
privacy.
The risks
1. Legal and regulatory risk. If data covered by mandate or regulation
– such as confidential, financial information (credit card or other
account data) or patient records – appears anywhere in dark data
collections, its exposure could involve legal and financial liability.
2. Intelligence risk. If dark data encompasses proprietary or sensitive
information reflective of business operations, practices, competitive
advantages, important partnerships and joint ventures, and so
forth, inadvertent disclosure could adversely affect the bottom line
or compromise important business activities and relationships.
The risks
3. Reputation risk. Any kind of data breach reflects badly on the
organizations affected thereby. This applies as much to dark data
(especially in light of other risks) as to other kinds of breaches.
4. Opportunity costs. Given that the organization has decided not to invest
in analysis and mining of dark data by definition, concerted efforts by
third parties to exploit its value represent potential losses of intelligence
and value based upon its contents.
5. Open-ended exposure. By definition, dark data contains information
that's either too difficult or costly to extract to be mined, or that contains
unknown (and therefore unevaluated) sources of intelligence and
exposure to loss or harm. Dark data's secrets may be very dark and
damaging indeed, but one has no way of knowing for sure.
Mitigating Risks Posed by Dark Data
1. Know where the dark is: ongoing inventory and assessment.
2. Turn dark to light: drive ongoing research into new tools and
technologies.
3. Understand where dark data resides, how it's stored, how it's
protected and what kinds of access controls help maintain its security.
4. No man's land: ubiquitous encryption. No dark data should be readily
accessible to casual inspection, under any circumstances.
5. Don't stay in the dark too long: retention policies and safe disposal.
6. Audit dark data for security purposes.
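Several of the steps above (inventory, encryption, retention) can be combined into a simple recurring audit. The following is a minimal sketch, not a reference to any particular tool: the inventory records, their field names and the two-year retention period are all assumptions made for illustration.

```python
from datetime import date, timedelta


def audit_inventory(inventory, today, retention_days):
    """Return (names past the retention period, names stored unencrypted)."""
    cutoff = today - timedelta(days=retention_days)
    expired = [d["name"] for d in inventory if d["last_used"] < cutoff]
    unencrypted = [d["name"] for d in inventory if not d["encrypted"]]
    return expired, unencrypted


# Hypothetical data inventory (names, dates and flags are invented).
inventory = [
    {"name": "call_center_audio", "last_used": date(2014, 3, 1), "encrypted": False},
    {"name": "web_click_logs", "last_used": date(2017, 5, 20), "encrypted": True},
    {"name": "raw_survey_data", "last_used": date(2012, 11, 15), "encrypted": True},
]

expired, unencrypted = audit_inventory(
    inventory, today=date(2017, 6, 1), retention_days=730
)
print("review for safe disposal:", expired)
print("needs encryption:", unencrypted)
```

A real audit would also need to respect legal holds and regulatory retention minimums before anything is disposed of.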
What are some other major areas in which dark
data is being underutilized besides underutilized
customer information?
• Education and Healthcare.
• The potential to service students and patients in the manner in which
the consumer and financial services pursue their target population is
huge.
• So much paperwork is involved in both education and healthcare, so
the data is there, and in the age of government incentives for electronic
health records, much of it in the healthcare space is now
digital.
• However, it needs to be mined and analyzed in order to lead to
opportunities that effect the change which usually results from the
strategic use of personal and behavioral data.
What kind of businesses can really benefit
from dark data extraction and processing?
• Any business that sells a product, service or idea (anyone who has
customers) can benefit.
• How many times a user resets their password
• IP address when a user logs into your website/app
• Last email communication date to your customers
• Mobile handset type, or web browser version
• Free text feedback on a hotel stay or recent flight
• Additional passengers or guest names on a ticket or hotel room
• These data points or features are often dismissed by marketing
teams as serving no useful purpose, as there is a perception that this
type of information is only collected for compliance, fraud or
regulatory requirements.
How old is too old when it comes to dark
data?
• Nothing is ever too old unless it is too old
• That said, if you’re analyzing, say, customer sentiment in social media,
you simply won’t have relevant data that predates the advent of
social channels. So in that case, dark data from before those channels
existed could be considered “too old.”
How can you turn dark data into active,
revenue generating data?
• This is where data science, marketing, and business intelligence need to get their
heads together to find new ways of activating dark data to provide new
opportunities for the organization. While dark data can appear dull and
uninteresting on the surface, there are methods to turn it into highly granular,
rich customer insights.
• Here are a few key steps to get you started on the above examples:
• Log-ins to your website or mobile application. What city/country do the IP
addresses map to? Are you logging each location a user visits and creating a virtual map of
their travels? This is particularly compelling when creating a 360-degree view of
your customer.
• Additional passengers/guest names on a reservation. Not only does this give
insight into the homophily of the user and fuel your social network graph of which
users are centrally connected and influential, but it also provides rich insight into
their family and workplace. Link this data with social graphing, and you’ll quickly
obtain age, gender, and behavioral traits.
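The co-traveller idea above can be sketched as a small graph computation: every pair of names on the same reservation becomes a link, and the number of distinct links per person gives a rough measure of who is centrally connected. The reservations and names are invented, and counting links (degree) is just one simple stand-in for the richer social graph analysis the slide alludes to.

```python
from collections import defaultdict
from itertools import combinations


def co_travel_degree(reservations):
    """Count, for each guest, how many distinct people they have travelled with."""
    neighbours = defaultdict(set)
    for guests in reservations:
        # Every pair of distinct names on one reservation forms a link.
        for a, b in combinations(sorted(set(guests)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {name: len(others) for name, others in neighbours.items()}


# Hypothetical reservations, each listing the guests booked together.
reservations = [
    ["ann", "bob"],
    ["ann", "cat", "dan"],
    ["bob", "ann"],
    ["eve"],
]

degree = co_travel_degree(reservations)
most_connected = max(degree, key=degree.get)
print(degree)
print("most connected guest:", most_connected)
```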
How can you turn dark data into active,
revenue generating data?
• Mobile phone data. This simple piece of data will illuminate an array of
new product and marketing opportunities, and provide an additional
segmentation layer to improve marketing effectiveness. From mobile
phone data, it’s possible to know which telco partners you should bring on
board (which will activate even MORE opportunities). You’ll also know where
your users are in the world in real time, whether they have recently purchased
tickets with another airline, and more.
• Free text input, such as feedback, can be passed through cognitive text
analysis tools to determine whether the general sentiment of the feedback is
positive or negative. Linking the user profile to your internal database can
also determine if this user is sending mixed messages on social media
compared to surveys and feedback forms. THINK AIRLINE
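As a rough illustration of the free-text point, the sketch below scores feedback with a tiny hand-made word list. This is a deliberately simple stand-in for the cognitive text analysis tools mentioned above, not a description of how such tools work; the word lists and sample sentences are invented.

```python
import re

# Hypothetical, hand-picked word lists; a real tool would use a trained model.
POSITIVE = {"great", "friendly", "clean", "comfortable", "excellent"}
NEGATIVE = {"late", "rude", "dirty", "lost", "delayed"}


def sentiment(text):
    """Label text as positive, negative or neutral by counting listed words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


feedback = [
    "Friendly crew and a clean cabin",
    "Flight delayed and my bag was lost",
    "The flight landed",
]
for text in feedback:
    print(sentiment(text), "-", text)
```

Comparing the label from a feedback form with the label from the same customer's social media posts is one way to spot the mixed messages mentioned above.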
Four Ways to Use Dark Data
1. Networking machine data. As noted above, servers, firewalls, network
monitoring tools and other parts of your environment generate large
amounts of machine data related to network operations. Avoid dark
networking data by using this information to analyze network security, as
well as to monitor network activity patterns to ensure that your network
infrastructure is never under- or over-utilized.
2. Customer support logs. Most businesses maintain records of customer-
support interactions that include information such as when a customer
contacted the business, which type of communication channel was used,
how long the engagement lasted and so on. Don’t make the mistake of
leaving this data in the dark, or using it only when you need to research a
customer issue. Instead, build it into your analytics workflows by
leveraging it to help understand when your customers are most likely to
contact you, what their preferred methods of contact are and so on.
Four Ways to Use Dark Data
3. “Legacy” system logs. If you have mainframes or other older types of
systems running in your environment, you may think that there is no way
to use modern analytics tools to understand them. But you can. By
offloading system logs and other data from these systems into an
analytics platform like Hadoop, you can make sure you are not leaving
this “legacy” data in the dark.
4. Non-textual data. Most data analytics workflows are built around textual
data, which is easier to ingest. You can also make use of video, audio or
other non-textual files, however. You can analyze the metadata
associated with them, or, if appropriate, translate speech to text in order
to gain more insight into the content of the data itself. The effort
required in this regard may not be worth it in all cases, but the bigger
point worth keeping in mind is that your non-textual data doesn’t have
to be dark data. There are ways to make it actionable if you need it to be.
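For the customer support logs in item 2, even a few lines of analysis can answer when customers are most likely to make contact and which channel they prefer. This sketch assumes each log row carries an ISO-formatted start time and a channel name; the field names and sample rows are hypothetical.

```python
from collections import Counter
from datetime import datetime


def summarize_contacts(log_rows):
    """Return (busiest hour of day, most used channel) for support log rows."""
    if not log_rows:
        raise ValueError("no log rows to summarize")
    by_hour = Counter(datetime.fromisoformat(r["started"]).hour for r in log_rows)
    by_channel = Counter(r["channel"] for r in log_rows)
    return by_hour.most_common(1)[0][0], by_channel.most_common(1)[0][0]


# Hypothetical support log (timestamps and channels are invented).
log_rows = [
    {"started": "2017-05-02T09:15:00", "channel": "phone"},
    {"started": "2017-05-02T09:40:00", "channel": "chat"},
    {"started": "2017-05-03T14:05:00", "channel": "chat"},
    {"started": "2017-05-04T09:55:00", "channel": "chat"},
]

peak_hour, top_channel = summarize_contacts(log_rows)
print(f"busiest hour: {peak_hour}:00, preferred channel: {top_channel}")
```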
LET THERE BE LIGHT: Dark Data Analytics
• Dark analytics efforts typically focus on three dimensions:
1. Untapped data already in your possession
2. Nontraditional unstructured data
3. Data in the deep web
• To be clear, the purpose of dark analytics is not to catalog vast volumes of
unstructured data. Casting a broader data net without a specific purpose in
mind will likely lead to failure. Indeed, dark analytics efforts that are
surgically precise in both intent and scope often deliver the greatest value.
Like every analytics journey, successful efforts begin with a series of
specific questions. What problem are you solving? What would we do
differently if we could solve that problem? Finally, what data sources and
analytics capabilities will help us answer the first two questions?
DeepDive
• http://deepdive.stanford.edu/quickstart
• DeepDive is a system to extract value from dark data. Like dark matter, dark data
is the great mass of data buried in text, tables, figures, and images, which lacks
structure and so is essentially unprocessable by existing software.
• DeepDive helps bring dark data to light by creating structured data (SQL tables)
from unstructured information (text documents) and integrating such data with
an existing structured database.
• DeepDive is used to extract sophisticated relationships between entities and
make inferences about facts involving those entities.
• DeepDive helps one process a wide variety of dark data and put the results into a
database. With the data in a database, one can use a variety of standard tools
that consume structured data; e.g., visualization tools like Tableau or analytics
tools like Excel.
• http://deepdive.stanford.edu/showcase/apps
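The general idea of turning text documents into SQL tables can be shown with a toy example. To be clear, this does not use DeepDive or its API, and it swaps DeepDive's inference over entities for a single regular expression; the documents, the "acquired" pattern and the table layout are all invented for illustration.

```python
import re
import sqlite3

# Toy pattern: "<Capitalized Name> acquired <Capitalized Name>".
ACQUISITION = re.compile(
    r"([A-Z]\w+(?: [A-Z]\w+)*) acquired ([A-Z]\w+(?: [A-Z]\w+)*)"
)


def extract_to_table(documents):
    """Load (doc_id, acquirer, target) rows found in the text into SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE acquisitions (doc_id INTEGER, acquirer TEXT, target TEXT)"
    )
    for doc_id, text in enumerate(documents):
        for match in ACQUISITION.finditer(text):
            conn.execute(
                "INSERT INTO acquisitions VALUES (?, ?, ?)",
                (doc_id, match.group(1), match.group(2)),
            )
    return conn


# Hypothetical unstructured documents.
documents = [
    "Acme Corp acquired Widget Works in 2015.",
    "Analysts say Globex acquired Initech last year.",
]

conn = extract_to_table(documents)
rows = conn.execute(
    "SELECT doc_id, acquirer, target FROM acquisitions ORDER BY doc_id"
).fetchall()
print(rows)
```

Once the facts sit in a table, they can be joined with existing structured data or handed to standard reporting tools, which is the benefit the slide describes.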
Lessons from the front lines
• IU HEALTH’S RX FOR MINING DARK DATA
• Retailers make it personal
• Oil Company
IU HEALTH’S RX FOR MINING DARK DATA
• As part of a new model of care, Indiana
University Health (IU Health) is exploring
ways to use nontraditional and unstructured
data to personalize health care for
individual patients and improve overall
health outcomes for the broader
population.
• Traditional relationships between medical
care providers and patients are often
transactional in nature, focusing on
individual visits and specific outcomes
rather than providing holistic care services
on an ongoing basis. IU Health has
determined that incorporating insights from
additional data will help build patient
loyalty and provide more useful, seamless,
and cost-efficient care.
IU HEALTH’S RX FOR MINING DARK DATA
• “IU Health needs a 360-degree understanding of the
patients it serves in order to create the kind of care and
services that will keep them in the system.”
• For example, consider the voluminous free-form notes—
both written and verbal—that physicians generate
during patient consultations.
• Deploying voice recognition, deep learning, and text
analysis capabilities to these in-hand but previously
underutilized sources could potentially add more depth
and detail to patient medical records.
• These same capabilities might also be used to analyze
audio recordings of patient conversations with IU Health
call centers to further enhance a patient’s records. Such
insights could help IU Health develop a more thorough
understanding of the patient’s needs, and better
illuminate how those patients utilize the health system’s
services.
IU HEALTH’S RX FOR MINING DARK DATA
• Another opportunity involves using dark data to help predict need and manage care
across populations. IU Health is examining how cognitive computing, external data, and
patient data could help identify patterns of illness, health care access, and historical
outcomes in local populations. The approaches could make it possible to incorporate
socioeconomic factors that may affect patients’ engagement with health care providers.
• “There may be a correlation between high density per living unit and disengagement
from health,” says Mark Lantzy, senior vice president and chief information officer, IU
Health. “It is promising that we can augment patient data with external data to
determine how to better engage with people about their health. We are creating the
underlying platform to uncover those correlations and are trying to create something
more systemic.”
• “The destination for our journey is an improved patient experience,” he continues.
“Ultimately, we want it to drive better satisfaction and engagement. More than deliver
great health care to individual patients, we want to improve population health
throughout Indiana as well. To be able to impact that in some way, even incrementally,
would be hugely beneficial.”
Retailers make it personal
• Retailers almost universally recognize that digital has reshaped customer
behavior and shopping. In fact, $0.56 of every dollar spent in a store is
influenced by a digital interaction.
• Yet many retailers—particularly those with brick-and-mortar operations—
still struggle to deliver the digital experiences customers expect. Some
focus excessively on their competitors instead of their customers and rely
on the same old key performance indicators and data.
• In recent years, however, growing numbers of retailers have begun
exploring different approaches to developing digital experiences. Some are
analyzing previously dark data culled from customers’ digital lives and
using the resulting insights to develop merchandising, marketing, customer
service, and even product development strategies that offer shoppers a
targeted and individualized customer experience.
Retailers make it personal
• Stitch Fix, for example, is an online subscription shopping service that
uses images from social media and other sources to track emerging
fashion trends and evolving customer preferences.
• Its process begins with clients answering a detailed questionnaire
about their tastes in clothing. Then, with client permission, the
company’s team of 60 data scientists augments that information by
scanning images on customers’ Pinterest boards and other social
media sites, analyzing them, and using the resulting insights to
develop a deeper understanding of each customer’s sense of style.
• Company stylists and artificial intelligence algorithms use these
profiles to select style-appropriate items of clothing to be shipped to
individual customers at regular intervals.
Retailers make it personal
• Meanwhile, grocery supermarket chain Kroger Co. is taking a different
approach that leverages Internet of Things and advanced analytics
techniques. As part of a pilot program, the company is embedding a
network of sensors and analytics into store shelves that can interact
with the Kroger app and a digital shopping list on a customer’s phone.
• As the customer strolls down each aisle, the system—which contains
a digital history of the customer’s purchases and product
preferences—can spotlight specially priced products the customer
may want on 4-inch displays mounted in the aisles. This pilot, which
began in late 2016 with initial testing in 14 stores, is expected to
expand in 2017.
GREG POWERS, VICE PRESIDENT OF TECHNOLOGY,
HALLIBURTON
• Yet the sheer volume of information that we can and do collect goes way
beyond human cognitive bandwidth. Advances in sensor science are
delivering enormous troves of both dark data and what I think of as really
dark data.
• For example, we scan rocks electromagnetically to determine their
consistency. We use nuclear magnetic resonance to perform what amounts
to an MRI on oil wells. Neutron and gamma-ray analysis measures the
electrical permittivity and conductivity of rock. Downhole spectroscopy
measures fluids. Acoustic sensors collect 1–2 terabytes of data daily.
• All of this dark data helps us better understand in-well performance. In
fact, there’s so much potential value buried in this darkness that I flip the
frame and refer to it as “bright data” that we have yet to tap.
GREG POWERS, VICE PRESIDENT OF TECHNOLOGY,
HALLIBURTON
• In the next phase of Halliburton’s ongoing analytics program, we want to develop
the capacity to capture, mine, and use bright data insights to become more
predictive.
• Given the nature of our operations, this will be no small task. Identical events
driven by common circumstances are rare in the oil and gas industry. We have 30
years of retrospective data, but there are an infinite number of combinations of
rock, gas, oil, and other variables that affect outcomes.
• Unfortunately, there is no overarching constituent physics equation that can
describe the right action to take for any situation encountered. Yet, even if we
can’t explain what we’ve seen historically, we can explore what has happened
and let our refined appreciation of historic data serve as a road map to where we
can go.
• In other words, we plan to correlate data to things that statistically seem to
matter and, then, use this data to develop a confidence threshold to inform how
we should approach these issues.
GREG POWERS, VICE PRESIDENT OF TECHNOLOGY,
HALLIBURTON
• We believe that nontraditional data holds the key to creating advanced intelligent
response capabilities to solve problems, potentially without human intervention, before
they happen.
• At the lowest level, we’ll take measurements and tell someone after the fact that
something happened. At the next level, our goal will be to recognize that something has
happened and, then, understand why it happened. The following step will use real-time
monitoring to provide in-the-moment awareness of what is taking place and why. In the
next tier, predictive tools will help us discern what’s likely to happen next. The most
extreme offering will involve automating the response—removing human intervention
from the equation entirely.
• Drilling is complicated work. To make it more autonomous and efficient, and to free
humans from mundane decision making, we need to work smarter. Our industry is facing
a looming generational change. Experienced employees will soon retire and take with
them decades of hard-won expertise and knowledge. We can’t just tell our new hires,
“Hey, go read 300 terabytes of dark data to get up to speed.” We’re going to have to rely
on new approaches for developing, managing, and sharing data-driven wisdom.
Where do you start?
Ask the right questions:
• Rather than attempting to discover and inventory all of the dark data
hidden within and outside your organization, work with business teams to
identify specific questions they want answered. Work to identify potential
dark analytics sources and the untapped opportunities contained therein.
• Then focus your analytics efforts on those data streams and sources that
are particularly relevant.
• For example, if marketing wants to boost sales of sports equipment in a
certain region, analytics teams can focus their efforts on real-time sales
transaction streams, inventory, and product pricing data at select stores
within the target region. They could then supplement this data with
historic unstructured data—in-store video analysis of customer foot traffic,
social sentiment, influencer behavior, or even pictures of displays or
product placement across sites—to generate more nuanced insights.
Look outside your organization:
• You can augment your own data with publicly available demographic,
location, and statistical information. Not only can this help your
analytics teams generate more expansive, detailed reports—it can put
insights in a more useful context.
• For example, a physician makes recommendations to an asthma
patient based on her known health history and a current examination.
By reviewing local weather data, he can also provide short-term
solutions to help her through a flare-up during pollen season. In
another example, employers might analyze data from geospatial
tools, traffic patterns, and employee turnover to determine the
extent to which employee job satisfaction levels are being adversely
impacted by commute times.
Augment data talent:
• Data scientists are an increasingly valuable resource, especially those who
can artfully combine deep modeling and statistical techniques with
industry or function-specific insights and creative problem framing. Going
forward, those with demonstrable expertise in a few areas will likely be in
demand.
• For example, both machine learning and deep learning require
programmatic expertise—the ability to build established patterns to
determine the appropriate combination of data corpus and method to
uncover reasonable, defensible insights. Likewise, visual and graphic design
skills may be increasingly critical given that visually communicating results
and explaining rationales are essential for broad organizational adoption.
• Finally, traditional skills such as master data management and data
architecture will be as valuable as ever—particularly as more companies
begin laying the foundations they’ll need to meet the diverse, expansive,
and exploding data needs of tomorrow.
Explore advanced visualization tools:
• Not everyone in your organization will be able to digest a printout of
advanced Bayesian statistics and apply them to business practices.
• Most people need to understand the “so what” and the “why” of complex
analytical insights before they can turn insight into action. In many
situations, information can be more easily digested when presented as an
infographic, a dashboard, or another type of visual representation.
• Visual and design software packages can do more than generate eye-
catching graphics such as bubble charts, word clouds, and heat maps—they
can boost business intelligence by repackaging big data into smaller, more
meaningful chunks, delivering value to users much faster. Additionally, the
insights (and the tools) can be made accessible across the enterprise,
beyond the IT department, and to business users at all levels, to create
more agile, cross-functional teams.
View it as a business-driven effort:
• It’s time to recognize analytics as an overall business strategy rather than
as an IT function. To that end, work with C-suite colleagues to garner
support for your dark analytics approach.
• Many CEOs are making data a cornerstone of overall business strategy,
which mandates more sophisticated techniques and accountability for
more deliberate handling of the underlying assets.
• By understanding your organization’s agenda and goals, you can determine
the value that must be delivered, define the questions that should be
asked, and decide how to harness available data to generate answers.
• Data analytics then becomes an insight-driven advantage in the
marketplace. The best way to help ensure buy-in is to first pilot a project
that will demonstrate the tangible ROI that can be realized by the
organization with a businesswide analytics strategy.
Think broadly:
• As you develop new capabilities and strategies, think about how you
can extend them across the organization as well as to customers,
vendors, and business partners. Your new data strategy becomes part
of your reference architecture that others can use.
Thank you
and
May the light shine on you
References
1. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html
2. http://www.kdnuggets.com/2015/01/shining-light-on-dark-data.html
3. http://www.kdnuggets.com/2016/03/rise-dark-data-how-harnessed.html
4. http://www.kdnuggets.com/solutions/fraud-detection.html
5. https://en.wikipedia.org/wiki/Operational_database
6. http://blog.syncsort.com/2017/05/big-data/4-dark-data-examples-use-cases/
7. Tracie Kambies, Paul Roma, Nitin Mittal, Sandeep Kumar Sharma,
https://dupress.deloitte.com/dup-us-en/focus/tech-trends/2017/dark-data-analyzing-unstructured-data.html

Mais conteúdo relacionado

Mais procurados

Static Indeterminacy and Kinematic Indeterminacy
Static Indeterminacy and Kinematic IndeterminacyStatic Indeterminacy and Kinematic Indeterminacy
Static Indeterminacy and Kinematic IndeterminacyDarshil Vekaria
 
Dark data

  • 1. Dark Data Alex Pongpech Graduate School of Applied Statistics NIDA
  • 2. Dark Data Alex Pongpech • Even bigger than big
  • 3. And so The dark
  • 11. Turning Dark
      • Useful data may become dark data once it is no longer relevant because it was not processed fast enough. These are called "perishable insights" in "live flowing data".
      • Examples: the geolocation of a customer, fraud detection.
      • According to IBM, about 60 percent of data loses its value immediately.
      • IBM estimates that roughly 90 percent of data generated by sensors and analog-to-digital conversions never gets used.
      • Not analysing data immediately and letting it go "dark" can lead to significant losses.
      • Data must not only be processed fast enough; it must also be acted on quickly enough.
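The idea of a perishable insight can be sketched as a freshness check: an event is only worth acting on while it is younger than some window. The event types, the window lengths and the function name below are assumptions made for illustration; the slides do not give concrete values.

```python
from datetime import datetime, timedelta

# Hypothetical freshness windows per event type (assumed values, not from the slides).
FRESHNESS = {
    "geolocation": timedelta(minutes=5),
    "fraud_check": timedelta(seconds=30),
}

def is_still_actionable(event_type, created_at, now):
    """Return True if the event is no older than its freshness window."""
    window = FRESHNESS.get(event_type)
    if window is None:
        return True  # no window defined: treat the event as non-perishable
    return now - created_at <= window

now = datetime(2024, 1, 1, 12, 0, 0)
events = [
    ("geolocation", datetime(2024, 1, 1, 11, 58, 0)),   # 2 minutes old
    ("geolocation", datetime(2024, 1, 1, 11, 40, 0)),   # 20 minutes old
    ("fraud_check", datetime(2024, 1, 1, 11, 59, 50)),  # 10 seconds old
]
status = [is_still_actionable(kind, ts, now) for kind, ts in events]
print(status)
```

Events that fail the check have already "gone dark" in the sense of the slide: they can still be stored, but the moment to act on them has passed.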
  • 12. Turning DARK
      • Organizations retain dark data for a multitude of reasons, and it is estimated that most companies analyze only 1% of their data.
      • A lot of dark data is unstructured: the information is in formats that may be difficult to categorise, to be read by a computer, and thus to analyse.
      • Often businesses do not analyse their dark data because of the amount of resources it would take and the difficulty of having that data analysed.
      • Because storage is inexpensive, storing data is easy. However, storing and securing the data usually entails greater expense (or even risk) than the potential return.
  • 13. Why is dark data handled the way it is?
      • It is surprising, because at the time of collection companies assume that the data is going to provide value. Companies invest a lot in data collection, so the data should be considered important, both monetarily and otherwise. Here are a few reasons why there is so much dark data.
  • 14. Why is dark data handled the way it is?
      1. Lopsided priorities: data on how the customer arrived at the application page.
      2. Disconnect among departments: data may not be known to other departments. "This is the way we do it here."
      3. Technology and tool constraints: if data collection is done by separate technologies and tools in the same organization, the results may be difficult to integrate, for example audio file contents from a call center with click data from websites.
  • 15. Shed some light on the DARK
      • Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).
      • In an industrial context, dark data can include information gathered by sensors and telematics.
      • Similar to dark matter in physics, dark data often comprises most organizations' universe of information assets.
      • Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value.
  • 16. Dark Data Example: IP Location
      • A soft drink manufacturer that runs a popular website might think that, of all the data it holds, only the data directly relevant to marketing and selling its products has any value. It also stores many other data, such as the IP location of its users, but fails to see how these "dark" data can also have value to the company.
      • Yet suppose these data, properly cleansed to a high quality and then analysed, reveal that 7% of the website's users access it from outside the country where the company is located, even though the product is sold directly only to retailers within that country. That finding is valuable in itself, for instance to those who target ads at soft drink consumers.
      • These dark data could also be seen as an opportunity to think about marketing the product elsewhere. For instance, if the IP location data showed that 40% of users from outside the company's home country accessed the site from India, while only 4% came from the European Union, it would strongly suggest that a marketing campaign within Europe would have considerably less chance of success than one aimed at the Indian subcontinent.
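The analysis described in this example amounts to counting visits per country. The sketch below assumes the IP addresses have already been resolved to country codes; the visit counts, the home country `TH` and the small set of EU codes are invented for illustration and chosen only so that the shares match the percentages on the slide.

```python
from collections import Counter

HOME = "TH"              # hypothetical home market; the slide names no country
EU = {"DE", "FR", "NL"}  # hypothetical subset of EU country codes

# Hypothetical visit log: one country code per visit, already derived from IP location.
visits = ["TH"] * 2325 + ["IN"] * 70 + ["DE"] * 7 + ["US"] * 98

def foreign_breakdown(visits, home):
    """Return (share of visits from abroad, share of foreign visits per country)."""
    counts = Counter(visits)
    total = sum(counts.values())
    foreign = {c: n for c, n in counts.items() if c != home}
    foreign_total = sum(foreign.values())
    if total == 0 or foreign_total == 0:
        return 0.0, {}
    by_country = {c: n / foreign_total for c, n in foreign.items()}
    return foreign_total / total, by_country

foreign_share, by_country = foreign_breakdown(visits, HOME)
eu_share = sum(share for c, share in by_country.items() if c in EU)

print(f"visits from abroad: {foreign_share:.0%}")
print(f"of those, India: {by_country['IN']:.0%}, EU: {eu_share:.0%}")
```

In practice the country lookup itself would come from a geolocation database or service, which is outside the scope of this sketch.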
  • 17. Other Dark Data Examples: type of device
      • Other typical examples of dark data, which most websites store but fail to exploit, include the type of device used to access the Internet (typically a smartphone, tablet or computer); the web browser used (e.g. Chrome, Mozilla Firefox, Opera, Edge or IE, among others); and even more obscure or dark information such as the number of times users reset their password, which would be useful to a company that specializes in Internet and password security.
  • 18. Other Dark Data Examples: Customer Feedback
      • A well-known example of dark data going to waste: a company offers a feedback form for its website or service but lacks the data structures that would allow the responses to be analysed easily. As a result it fails to take on board and act on its users' judgments and criticisms, whether positive or negative (both of which have value).
  • 19. How does data go DARK Customer Feedback
  • 22. More dark data examples
      • Customer Information
      • Log Files
      • Account Information
      • Previous Employee Data
      • Financial Statements
      • Raw Survey Data
      • Email Correspondences
      • Notes or Presentations
      • Old Versions of Relevant Documents
  • 23. The dark bites
      • "Maybe we can use all this data later?" This explains why many organizations are reluctant to part with dark data, even if they have no plans to put it to work, either in the near term or further down the planning horizon.
      • The dark can bite. Organizations must also be aware that the dark data they possess (or, perhaps more chillingly, the dark data about them, their customers and their operations that is stored in the cloud, outside their immediate control and management) can pose risks to their continued business health and well-being.
  • 24. Problems from the dark
      • Data that is stored but not used costs money (the NYT says 90% of the energy used by data centers is wasted).
      • Stored data costs money: according to Datamation, by 2020 data that is stored but unused can add up to $891 billion.
      • The more data is stored but not used, the higher the risk, especially to privacy.
  • 25. The risks
      1. Legal and regulatory risk. If data covered by mandate or regulation, such as confidential financial information (credit card or other account data) or patient records, appears anywhere in dark data collections, its exposure could involve legal and financial liability.
      2. Intelligence risk. If dark data encompasses proprietary or sensitive information reflective of business operations, practices, competitive advantages, important partnerships and joint ventures, and so forth, inadvertent disclosure could adversely affect the bottom line or compromise important business activities and relationships.
  • 26. The risks
      3. Reputation risk. Any kind of data breach reflects badly on the organizations affected. This applies as much to dark data (especially in light of the other risks) as to other kinds of breaches.
      4. Opportunity costs. Given that the organization has, by definition, decided not to invest in analysing and mining its dark data, concerted efforts by third parties to exploit its value represent potential losses of intelligence and value based upon its contents.
      5. Open-ended exposure. By definition, dark data contains information that is either too difficult or costly to extract to be mined, or that contains unknown (and therefore unevaluated) sources of intelligence and exposure to loss or harm. Dark data's secrets may be very dark and damaging indeed, but one has no way of knowing for sure.
  • 27. Mitigating Risks Posed by Dark Data
      1. Know where the dark is: ongoing inventory and assessment.
      2. Turn dark to light: drive ongoing research into new tools and technologies.
      3. Understand where dark data resides, how it is stored, how it is protected and what kinds of access controls help maintain its security.
      4. No man's land: ubiquitous encryption. No dark data should be readily accessible to casual inspection, under any circumstances.
      5. Don't stay in the dark too long: retention policies and safe disposal.
      6. Audit dark data for security purposes.
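Several of these steps (inventory, encryption, retention) can be combined into a periodic review of a data inventory. The sketch below is one minimal way to do that; the inventory records, the 365-day retention period and the field names are assumptions for illustration, not a policy taken from the slides.

```python
from datetime import date

RETENTION_DAYS = 365  # assumed retention policy, for illustration only

# Hypothetical inventory of stored datasets.
inventory = [
    {"name": "call_center_audio_2019", "last_used": date(2019, 6, 1), "encrypted": False},
    {"name": "web_logs", "last_used": date(2023, 11, 15), "encrypted": False},
    {"name": "raw_survey_data", "last_used": date(2023, 12, 1), "encrypted": True},
]

def review(inventory, today, retention_days=RETENTION_DAYS):
    """Flag datasets past retention for disposal and the rest for encryption if needed."""
    findings = []
    for item in inventory:
        age_days = (today - item["last_used"]).days
        if age_days > retention_days:
            findings.append((item["name"], "dispose or archive"))
        elif not item["encrypted"]:
            findings.append((item["name"], "encrypt"))
    return findings

findings = review(inventory, today=date(2024, 1, 1))
for name, action in findings:
    print(f"{name}: {action}")
```

Running such a review on a schedule and keeping its output is also a simple starting point for the audit mentioned in step 6.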
  • 28. What are some other major areas in which dark data is being underutilized besides customer information?
      • Education and healthcare.
      • The potential to serve students and patients in the way that consumer and financial services pursue their target populations is huge.
      • So much paperwork is involved in both education and healthcare, so the data is there, and in the age of government incentives for electronic health records, much of it in the healthcare space is now digital.
      • However, it needs to be mined and analyzed in order to lead to opportunities that effect the change which usually results from the strategic use of personal and behavioral data.
  • 29. What kind of businesses can really benefit from dark data extraction and processing?
      • Any business that sells a product, service or idea (anyone who has customers) can benefit.
      • How many times a user resets their password
      • IP address when a user logs into your website/app
      • Last email communication date to your customers
      • Mobile handset type, or web browser version
      • Free text feedback on a hotel stay or recent flight
      • Additional passengers or guest names on a ticket or hotel room
      • These data points or features are often overlooked by marketing teams as serving no useful purpose, as there is a perception that this type of information is only collected for compliance, fraud or regulatory requirements.
  • 30. How old is too old when it comes to dark data?
      • Nothing is ever too old unless it is too old.
      • That said, if you're analyzing, say, customer sentiment in social media, you simply won't have relevant data that predates the advent of social channels. So in that case, dark data from before those channels existed could be considered "too old."
  • 31. How can you turn dark data into active, revenue generating data?
      • This is where data science, marketing, and business intelligence need to get their heads together to find new ways of activating dark data to provide new opportunities for the organization. While dark data can appear dull and uninteresting on the surface, there are methods to turn it into highly granular, rich customer insights.
      • Here are a few key steps to get you started on the above examples:
      • Log-ins to your website or mobile application. Which city or country do the IP addresses belong to? Are you logging each location a user visits and creating a virtual map of their travels? This is particularly compelling when creating a 360-degree view of your customer.
      • Additional passengers/guest names on a reservation. Not only does this give insight into the homophily of the user and fuel your social network graph of which users are centrally connected and influential, but it also provides rich insight into their family and workplace. Link this data with social graphing, and you'll quickly obtain age, gender, and behavioral traits.
  • 32. How can you turn dark data into active, revenue-generating data? • Mobile phone data. This simple piece of data will illuminate an array of new product and marketing opportunities, and provide an additional segmentation layer to improve marketing effectiveness. From mobile phone data, it’s possible to know which telco partners you should bring on board (which will activate even MORE opportunities), where your users are in the world in real time, whether they have recently purchased tickets with another airline, and more. • Free text input, such as feedback, can be passed through cognitive text analysis tools to determine whether the general sentiment of the feedback is positive or negative. Linking the user profile to your internal database can also determine if this user is sending mixed messages on social media compared to surveys and feedback forms. • THINK AIRLINE
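
The reservation example above (additional passengers or guest names) can be sketched as a small co-travel graph. This is only an illustration under stated assumptions: the `reservations` data and the `co_travel_degrees` helper are hypothetical, and counting distinct co-travellers (degree) is just one simple stand-in for "centrally connected" users.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical reservations: each tuple lists the people booked together.
reservations = [
    ("alice", "bob"),
    ("alice", "carol", "dave"),
    ("bob", "erin"),
    ("alice", "erin"),
]


def co_travel_degrees(bookings):
    """Return the number of distinct co-travellers for each person."""
    neighbours = defaultdict(set)
    for booking in bookings:
        # Every pair of people on the same booking shares an edge.
        for a, b in combinations(set(booking), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {person: len(others) for person, others in neighbours.items()}


degrees = co_travel_degrees(reservations)
most_connected = max(degrees, key=degrees.get)
print(most_connected, degrees[most_connected])
```

Degree is the crudest centrality measure; a real social graph would likely call for a dedicated graph library and richer measures.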
  • 33. Four Ways to Use Dark Data 1. Networking machine data. As noted above, servers, firewalls, network monitoring tools and other parts of your environment generate large amounts of machine data related to network operations. Avoid dark networking data by using this information to analyze network security, as well as to monitor network activity patterns to ensure that your network infrastructure is never under- or over-utilized. 2. Customer support logs. Most businesses maintain records of customer-support interactions that include information such as when a customer contacted the business, which type of communication channel was used, how long the engagement lasted and so on. Don’t make the mistake of leaving this data in the dark, or using it only when you need to research a customer issue. Instead, build it into your analytics workflows by leveraging it to help understand when your customers are most likely to contact you, what their preferred methods of contact are and so on.
  • 34. Four Ways to Use Dark Data 3. “Legacy” system logs. If you have mainframes or other older types of systems running in your environment, you may think that there is no way to use modern analytics tools to understand them. But you can. By offloading system logs and other data from these systems into an analytics platform like Hadoop, you can make sure you are not leaving this “legacy” data in the dark. 4. Non-textual data. Most data analytics workflows are built around textual data, which is easier to ingest. You can also make use of video, audio or other non-textual files, however. You can analyze the metadata associated with them, or, if appropriate, translate speech to text in order to gain more insight into the content of the data itself. The effort required in this regard may not be worth it in all cases, but the bigger point worth keeping in mind is that your non-textual data doesn’t have to be dark data. There are ways to make it actionable if you need it to be.
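
As a minimal sketch of the customer support log idea (item 2 above), the snippet below counts contacts by hour of day and by channel. The record layout and the `support_log` sample are assumptions made for illustration; real support systems will export different fields.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log records: (ISO timestamp, channel, duration in minutes).
support_log = [
    ("2017-05-01T09:15:00", "phone", 12),
    ("2017-05-01T09:40:00", "email", 5),
    ("2017-05-01T14:05:00", "chat", 7),
    ("2017-05-02T09:55:00", "phone", 20),
    ("2017-05-02T18:30:00", "phone", 3),
]


def summarize(log):
    """Count contacts per hour of day and per communication channel."""
    by_hour = Counter(datetime.fromisoformat(ts).hour for ts, _, _ in log)
    by_channel = Counter(channel for _, channel, _ in log)
    return by_hour, by_channel


by_hour, by_channel = summarize(support_log)
busiest_hour = by_hour.most_common(1)[0][0]
preferred_channel = by_channel.most_common(1)[0][0]
print(busiest_hour, preferred_channel)
```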
  • 35. LET THERE BE LIGHT: Dark Data Analytics • Dark analytics efforts typically focus on three dimensions: 1. Untapped data already in your possession 2. Nontraditional unstructured data 3. Data in the deep web • To be clear, the purpose of dark analytics is not to catalog vast volumes of unstructured data. Casting a broader data net without a specific purpose in mind will likely lead to failure. Indeed, dark analytics efforts that are surgically precise in both intent and scope often deliver the greatest value. Like every analytics journey, successful efforts begin with a series of specific questions. What problem are you solving? What would we do differently if we could solve that problem? Finally, what data sources and analytics capabilities will help us answer the first two questions?
  • 36. DeepDive • http://deepdive.stanford.edu/quickstart • DeepDive is a system to extract value from dark data. Like dark matter, dark data is the great mass of data buried in text, tables, figures, and images, which lacks structure and so is essentially unprocessable by existing software. • DeepDive helps bring dark data to light by creating structured data (SQL tables) from unstructured information (text documents) and integrating such data with an existing structured database. • DeepDive is used to extract sophisticated relationships between entities and make inferences about facts involving those entities. • DeepDive helps one process a wide variety of dark data and put the results into a database. With the data in a database, one can use a variety of standard tools that consume structured data; e.g., visualization tools like Tableau or analytics tools like Excel. • http://deepdive.stanford.edu/showcase/apps
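
The general idea of turning text documents into SQL tables can be illustrated with a toy example. This is not DeepDive and does not use its API: it swaps DeepDive's relationship extraction and inference for a single hand-written regular expression, and the documents, the `spouse` table, and the sentence pattern are all assumptions.

```python
import re
import sqlite3

# Hypothetical unstructured notes; the sentence pattern below is an assumption.
documents = [
    "Barack Obama is married to Michelle Obama.",
    "The meeting was postponed until Friday.",
    "John Lennon was married to Yoko Ono.",
]

# Two capitalised words, "is/was married to", two capitalised words.
PATTERN = re.compile(
    r"([A-Z][a-z]+ [A-Z][a-z]+) (?:is|was) married to ([A-Z][a-z]+ [A-Z][a-z]+)"
)


def extract_spouses(texts):
    """Return (doc_id, person1, person2) rows found in the raw text."""
    rows = []
    for doc_id, text in enumerate(texts):
        for match in PATTERN.finditer(text):
            rows.append((doc_id, match.group(1), match.group(2)))
    return rows


# Load the extracted rows into an in-memory SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spouse (doc_id INTEGER, person1 TEXT, person2 TEXT)")
conn.executemany("INSERT INTO spouse VALUES (?, ?, ?)", extract_spouses(documents))
rows = conn.execute(
    "SELECT person1, person2 FROM spouse ORDER BY doc_id"
).fetchall()
print(rows)
```

Once the rows are in a table, any standard tool that consumes structured data can query them, which is the point the slide makes.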
  • 37. Lessons from the front lines • IU HEALTH’S RX FOR MINING DARK DATA • Retailers make it personal • Oil Company
  • 38. IU HEALTH’S RX FOR MINING DARK DATA • As part of a new model of care, Indiana University Health (IU Health) is exploring ways to use nontraditional and unstructured data to personalize health care for individual patients and improve overall health outcomes for the broader population. • Traditional relationships between medical care providers and patients are often transactional in nature, focusing on individual visits and specific outcomes rather than providing holistic care services on an ongoing basis. IU Health has determined that incorporating insights from additional data will help build patient loyalty and provide more useful, seamless, and cost-efficient care.
  • 39. IU HEALTH’S RX FOR MINING DARK DATA • “IU Health needs a 360-degree understanding of the patients it serves in order to create the kind of care and services that will keep them in the system.” • For example, consider the voluminous free-form notes (both written and verbal) that physicians generate during patient consultations. • Deploying voice recognition, deep learning, and text analysis capabilities to these in-hand but previously underutilized sources could potentially add more depth and detail to patient medical records. • These same capabilities might also be used to analyze audio recordings of patient conversations with IU Health call centers to further enhance a patient’s records. Such insights could help IU Health develop a more thorough understanding of the patient’s needs, and better illuminate how those patients utilize the health system’s services.
  • 40. IU HEALTH’S RX FOR MINING DARK DATA • Another opportunity involves using dark data to help predict need and manage care across populations. IU Health is examining how cognitive computing, external data, and patient data could help identify patterns of illness, health care access, and historical outcomes in local populations. The approaches could make it possible to incorporate socioeconomic factors that may affect patients’ engagement with health care providers. • “There may be a correlation between high density per living unit and disengagement from health,” says Mark Lantzy, senior vice president and chief information officer, IU Health. “It is promising that we can augment patient data with external data to determine how to better engage with people about their health. We are creating the underlying platform to uncover those correlations and are trying to create something more systemic. • “The destination for our journey is an improved patient experience,” he continues. “Ultimately, we want it to drive better satisfaction and engagement. More than deliver great health care to individual patients, we want to improve population health throughout Indiana as well. To be able to impact that in some way, even incrementally, would be hugely beneficial.”
  • 41. Retailers make it personal • Retailers almost universally recognize that digital has reshaped customer behavior and shopping. In fact, $0.56 of every dollar spent in a store is influenced by a digital interaction. • Yet many retailers, particularly those with brick-and-mortar operations, still struggle to deliver the digital experiences customers expect. Some focus excessively on their competitors instead of their customers and rely on the same old key performance indicators and data. • In recent years, however, growing numbers of retailers have begun exploring different approaches to developing digital experiences. Some are analyzing previously dark data culled from customers’ digital lives and using the resulting insights to develop merchandising, marketing, customer service, and even product development strategies that offer shoppers a targeted and individualized customer experience.
  • 42. Retailers make it personal • Stitch Fix, for example, is an online subscription shopping service that uses images from social media and other sources to track emerging fashion trends and evolving customer preferences. • Its process begins with clients answering a detailed questionnaire about their tastes in clothing. Then, with client permission, the company’s team of 60 data scientists augments that information by scanning images on customers’ Pinterest boards and other social media sites, analyzing them, and using the resulting insights to develop a deeper understanding of each customer’s sense of style. • Company stylists and artificial intelligence algorithms use these profiles to select style-appropriate items of clothing to be shipped to individual customers at regular intervals.
  • 43. Retailers make it personal • Meanwhile, grocery supermarket chain Kroger Co. is taking a different approach that leverages Internet of Things and advanced analytics techniques. As part of a pilot program, the company is embedding a network of sensors and analytics into store shelves that can interact with the Kroger app and a digital shopping list on a customer’s phone. • As the customer strolls down each aisle, the system—which contains a digital history of the customer’s purchases and product preferences—can spotlight specially priced products the customer may want on 4-inch displays mounted in the aisles. This pilot, which began in late 2016 with initial testing in 14 stores, is expected to expand in 2017.
  • 44. GREG POWERS, VICE PRESIDENT OF TECHNOLOGY, HALLIBURTON • Yet the sheer volume of information that we can and do collect goes way beyond human cognitive bandwidth. Advances in sensor science are delivering enormous troves of both dark data and what I think of as really dark data. • For example, we scan rocks electromagnetically to determine their consistency. We use nuclear magnetic resonance to perform what amounts to an MRI on oil wells. Neutron and gamma-ray analysis measures the electrical permittivity and conductivity of rock. Downhole spectroscopy measures fluids. Acoustic sensors collect 1–2 terabytes of data daily. • All of this dark data helps us better understand in-well performance. In fact, there’s so much potential value buried in this darkness that I flip the frame and refer to it as “bright data” that we have yet to tap.
  • 45. GREG POWERS, VICE PRESIDENT OF TECHNOLOGY, HALLIBURTON • In the next phase of Halliburton’s ongoing analytics program, we want to develop the capacity to capture, mine, and use bright data insights to become more predictive. • Given the nature of our operations, this will be no small task. Identical events driven by common circumstances are rare in the oil and gas industry. We have 30 years of retrospective data, but there are an infinite number of combinations of rock, gas, oil, and other variables that affect outcomes. • Unfortunately, there is no overarching constituent physics equation that can describe the right action to take for any situation encountered. Yet, even if we can’t explain what we’ve seen historically, we can explore what has happened and let our refined appreciation of historic data serve as a road map to where we can go. • In other words, we plan to correlate data to things that statistically seem to matter and, then, use this data to develop a confidence threshold to inform how we should approach these issues.
  • 46. GREG POWERS, VICE PRESIDENT OF TECHNOLOGY, HALLIBURTON • We believe that nontraditional data holds the key to creating advanced intelligent response capabilities to solve problems, potentially without human intervention, before they happen. • At the lowest level, we’ll take measurements and tell someone after the fact that something happened. At the next level, our goal will be to recognize that something has happened and, then, understand why it happened. The following step will use real-time monitoring to provide in-the-moment awareness of what is taking place and why. In the next tier, predictive tools will help us discern what’s likely to happen next. The most extreme offering will involve automating the response—removing human intervention from the equation entirely. • Drilling is complicated work. To make it more autonomous and efficient, and to free humans from mundane decision making, we need to work smarter. Our industry is facing a looming generational change. Experienced employees will soon retire and take with them decades of hard-won expertise and knowledge. We can’t just tell our new hires, “Hey, go read 300 terabytes of dark data to get up to speed.” We’re going to have to rely on new approaches for developing, managing, and sharing data-driven wisdom.
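
The plan described above, correlating data with the things that statistically seem to matter and then applying a threshold, could be sketched roughly as follows. This is not Halliburton's method: the sensor names, readings, and the 0.8 cutoff are all hypothetical, and a plain Pearson correlation cutoff stands in for whatever confidence threshold the speaker has in mind.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical outcome measurements and candidate sensor signals.
outcome = [1.0, 2.0, 3.0, 4.0, 5.0]
signals = {
    "acoustic_energy": [2.1, 3.9, 6.2, 8.0, 9.8],
    "fluid_density": [5.0, 1.0, 4.0, 2.0, 3.0],
}

THRESHOLD = 0.8  # assumed cutoff on |r|, chosen only for illustration

scores = {name: pearson(values, outcome) for name, values in signals.items()}
selected = [name for name, r in scores.items() if abs(r) >= THRESHOLD]
print(selected)
```

Correlation alone does not explain why a signal matters, which matches the speaker's framing: the aim is a road map drawn from historic data, not a physics equation.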
  • 47. Where do you start? Ask the right questions: • Rather than attempting to discover and inventory all of the dark data hidden within and outside your organization, work with business teams to identify specific questions they want answered. Work to identify potential dark analytics sources and the untapped opportunities contained therein. • Then focus your analytics efforts on those data streams and sources that are particularly relevant. • For example, if marketing wants to boost sales of sports equipment in a certain region, analytics teams can focus their efforts on real-time sales transaction streams, inventory, and product pricing data at select stores within the target region. They could then supplement this data with historic unstructured data—in-store video analysis of customer foot traffic, social sentiment, influencer behavior, or even pictures of displays or product placement across sites—to generate more nuanced insights.
  • 48. Look outside your organization: • You can augment your own data with publicly available demographic, location, and statistical information. Not only can this help your analytics teams generate more expansive, detailed reports—it can put insights in a more useful context. • For example, a physician makes recommendations to an asthma patient based on her known health history and a current examination. By reviewing local weather data, he can also provide short-term solutions to help her through a flare-up during pollen season. In another example, employers might analyze data from geospatial tools, traffic patterns, and employee turnover to determine the extent to which employee job satisfaction levels are being adversely impacted by commute times.
  • 49. Augment data talent: • Data scientists are an increasingly valuable resource, especially those who can artfully combine deep modeling and statistical techniques with industry or function-specific insights and creative problem framing. Going forward, those with demonstrable expertise in a few areas will likely be in demand. • For example, both machine learning and deep learning require programmatic expertise—the ability to build established patterns to determine the appropriate combination of data corpus and method to uncover reasonable, defensible insights. Likewise, visual and graphic design skills may be increasingly critical given that visually communicating results and explaining rationales are essential for broad organizational adoption. • Finally, traditional skills such as master data management and data architecture will be as valuable as ever—particularly as more companies begin laying the foundations they’ll need to meet the diverse, expansive, and exploding data needs of tomorrow.
  • 50. Explore advanced visualization tools: • Not everyone in your organization will be able to digest a printout of advanced Bayesian statistics and apply them to business practices. • Most people need to understand the “so what” and the “why” of complex analytical insights before they can turn insight into action. In many situations, information can be more easily digested when presented as an infographic, a dashboard, or another type of visual representation. • Visual and design software packages can do more than generate eye-catching graphics such as bubble charts, word clouds, and heat maps; they can boost business intelligence by repackaging big data into smaller, more meaningful chunks, delivering value to users much faster. Additionally, the insights (and the tools) can be made accessible across the enterprise, beyond the IT department, and to business users at all levels, to create more agile, cross-functional teams.
  • 51. View it as a business-driven effort: • It’s time to recognize analytics as an overall business strategy rather than as an IT function. To that end, work with C-suite colleagues to garner support for your dark analytics approach. • Many CEOs are making data a cornerstone of overall business strategy, which mandates more sophisticated techniques and accountability for more deliberate handling of the underlying assets. • By understanding your organization’s agenda and goals, you can determine the value that must be delivered, define the questions that should be asked, and decide how to harness available data to generate answers. • Data analytics then becomes an insight-driven advantage in the marketplace. The best way to help ensure buy-in is to first pilot a project that will demonstrate the tangible ROI that can be realized by the organization with a businesswide analytics strategy.
  • 52. Think broadly: • As you develop new capabilities and strategies, think about how you can extend them across the organization as well as to customers, vendors, and business partners. Your new data strategy becomes part of your reference architecture that others can use.
  • 53. Thank you, and may the light shine on you
  • 54. References 1. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html 2. http://www.kdnuggets.com/2015/01/shining-light-on-dark-data.html 3. http://www.kdnuggets.com/2016/03/rise-dark-data-how-harnessed.html 4. http://www.kdnuggets.com/solutions/fraud-detection.html 5. https://en.wikipedia.org/wiki/Operational_database 6. http://blog.syncsort.com/2017/05/big-data/4-dark-data-examples-use-cases/ 7. Tracie Kambies, Paul Roma, Nitin Mittal, Sandeep Kumar Sharma, https://dupress.deloitte.com/dup-us-en/focus/tech-trends/2017/dark-data-analyzing-unstructured-data.html