At CNS we are developing a holistic artificial intelligence cyber system. The system will be composed
of seven intertwined modules/elements: Memory, Sensation, Perception, Reasoning,
Thought, Consciousness, and Decision Making and Action.
We are well aware that the task is monumental and improbable, and that's exactly the point. At CNS we
believe that in the process of trying to make the impossible possible, we must stretch our minds
to their limit, think unlike anyone else, be original and inventive, and devise solutions that
challenge the boundaries of what we know.
Case study: Big Data
Big Data on a teaspoon:
Big data refers to the approaches, methods and tools needed to uncover critical hidden
information in datasets of massive scale. Big data usually involves datasets whose size is
beyond the ability of commonly used tools to process and analyse within a practical and
acceptable lead time. Big data is growing fast: since 2012, data volumes have grown from tens of
terabytes to petabytes today.
Challenges of Big Data:
The key problem of Big Data is that it is growing faster than Moore's law for computation speed. This
problem will only get worse in the years to come, in particular with the next generation of data
sources such as gene sequencers, NMR imaging, social media, the internet of everything and future
unknowns. It is important to note that, when dealing with Big Data, there are two crucial challenges.
The first is to identify robust methods to extract the critical information needed from the Big Data
set or, to put this in layman's terms, to find a needle in a haystack. The second challenge is to develop
solutions that enable fast computation on Big Data, in particular when data is growing faster
than the computation rate. To deal with these challenges effectively, any proposed solutions and tools
should be able to transform Big Data sets into Small Data sets while retaining all the relevant
information and, ideally, eliminating data noise.
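To see why data growth that outpaces compute growth is a compounding problem, the sketch below compares two exponential growth curves. The doubling times are hypothetical values chosen only for illustration; they are not figures from the text, and `backlog_ratio` is a name introduced here.

```python
def backlog_ratio(years, data_doubling_years, compute_doubling_years):
    """Return how much longer one full pass over the data takes after `years`,
    relative to today, when data volume and compute speed both grow exponentially."""
    data_growth = 2 ** (years / data_doubling_years)
    compute_growth = 2 ** (years / compute_doubling_years)
    return data_growth / compute_growth


# Hypothetical doubling times (assumptions, not measured values):
# data doubles every year, compute speed doubles every two years.
ratio = backlog_ratio(years=6, data_doubling_years=1.0, compute_doubling_years=2.0)
print(f"Processing time grows by a factor of {ratio:.0f}")
```

Under these assumed rates the time for a full pass keeps growing year after year, which is the gap that transforming Big Data sets into Small Data sets is meant to close.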
How to transform Big Data sets into Small Data sets while retaining all the information
One of the most effective and well-established approaches to dealing with Big Data is
statistics. A good representative sample of the Big Data set, in conjunction with the correct use of
statistical methods and tools, can extract the vital information needed to answer our questions
within a stated confidence level and margin of error.
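A minimal sketch of the sampling idea, assuming a synthetic population and a simple random sample: the mean of a large data set is estimated from a small sample, together with a margin of error at roughly 95% confidence. The function name `estimate_mean`, the synthetic population and the normal-approximation multiplier `z = 1.96` are illustrative assumptions, not part of the original text.

```python
import math
import random
import statistics


def estimate_mean(population, sample_size, z=1.96, seed=0):
    """Estimate the population mean from a simple random sample.

    Returns (sample_mean, margin_of_error), where the margin uses the
    normal approximation z * s / sqrt(n).
    """
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, z * std_err


# Synthetic stand-in for a large data set (assumption for illustration only).
rng = random.Random(42)
population = [rng.gauss(100.0, 15.0) for _ in range(200_000)]

mean, margin = estimate_mean(population, sample_size=10_000)
print(f"estimated mean = {mean:.2f} +/- {margin:.2f} (about 95% confidence)")
```

Only 5% of the records are read here, yet the estimate comes with an explicit margin of error. Quadrupling the sample size halves that margin.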
But what happens when statistics is not the appropriate approach, or the typology of the problem
is not suited to statistical methods?
We at CNS have developed a groundbreaking method and supporting tools which, under certain conditions
(ill-posed problems), can reduce Big Data size by the square root of the data set dimension (i.e. a set of
10^9 data records is reduced to ~3 × 10^4), making it possible to vaporize the haystack (the Big Data set) while leaving
the needle (the information) intact and free of noise. The innovative method and tools have been tested
and the proof of concept has been established. The mathematical approach and proposed
algorithms produce information reconstructions of greater quality than any other existing method,
but at the one-off cost of convergence time. However, once the data has been transformed, the
manipulation and analysis time is reduced significantly: our experimental results showed a reduction
in processing time by a factor of 50. Another pronounced benefit of this approach is the ability to
reconstruct the information with a high level of quality and completeness, regardless of the data
structure or data size (great news for cloud computing). Although we have achieved notable results
in the tests so far, additional experiments are planned to further solidify the validation of this
innovative, breakthrough approach.
For our tests, we used data from NMR experiments, and were consistently able to reduce the
original data sets from an average of 750 GB to an average of 0.045 GB, a factor of ~10^4, without loss of
information, while eliminating the data noise. At present, we are working on improving the method
to reduce the data set size even further. A paper with the preliminary results will be published by
the end of July this year.
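As a quick sanity check on the orders of magnitude involved, the short calculation below works out the size reduction implied by the quoted NMR figures (750 GB down to 0.045 GB) and the square root of a 10^9-record data set. The variable names are introduced here for illustration. Only the input numbers come from the text.

```python
import math

original_gb = 750.0   # average original data set size quoted in the text
reduced_gb = 0.045    # average reduced data set size quoted in the text
size_factor = original_gb / reduced_gb

records = 10**9       # record count used in the text's example
sqrt_records = math.sqrt(records)

print(f"size reduction factor: {size_factor:,.0f}")
print(f"square root of 10^9 records: {sqrt_records:,.0f}")
```

Both values come out on the order of 10^4, so the reported NMR reduction is in line with square-root scaling.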

CNS and Big Data
