Maryann E. Martone, Ph.D.
University of California, San Diego
Neuroscience is unlikely to be served by a few large databases, as the genomics and proteomics communities are
Whole brain data (20 µm microscopic MRI)
Mosaic LM images (1 GB+)
Conventional LM images
Individual cell morphologies
EM volumes & reconstructions
Solved molecular structures
No single technology serves these all equally well.
Multiple data types; multiple scales; multiple databases
http://neuinfo.org
• NIF’s mission is to maximize the awareness of, access to and utility of research resources produced worldwide to enable better science and promote efficient use
– NIF unites neuroscience information without respect to domain, funding agency, institute or community
– NIF is like a “PubMed” for all biomedical resources and a “PubMed Central” for databases
– Makes them searchable from a single interface
– Practical and cost-effective; tries to be sensible
– Learned a lot about current data practices
The Neuroscience Information Framework is an initiative of the NIH Blueprint consortium of institutes
http://neuinfo.org
June 10, 2013, dkCOIN Investigator's Retreat
• A portal for finding and using neuroscience resources
• A consistent framework for describing resources
• Provides simultaneous search of multiple types of information, organized by category
• Supported by an expansive ontology for neuroscience
• Utilizes advanced technologies to search the “hidden web”
UCSD, Yale, Cal Tech, George Mason, Washington Univ
Literature
Database Federation
Registry
We’d like to be able to find:
• What is known:
– What are the projections of hippocampus?
– Is GRM1 expressed in cerebral cortex?
– What genes have been found to be upregulated in chronic drug abuse in adults?
– What animal models have similar phenotypes to Parkinson’s disease?
– What studies used my polyclonal antibody against GABA in humans?
• What is not known:
– Connections among data
– Gaps in knowledge
A framework makes it easier to address these questions
With the thousands of databases and other information sources available, simple descriptive metadata will not suffice
• NIF curators
• Nomination by the community
• Semi-automated text mining pipelines
• NIF Registry
• Requires no special skills
• Site map available for local hosting
• NIF Data Federation
• DISCO interop
• Requires some programming skill
• Open Source Brain < 2 hr
Two-tiered system: low barrier to entry
Current
Planned
DISCO Dashboard Functions
• Ingest Script Manager
• Public Script Repository
• Data & Event Tracker
• Versioning System
• Curator Tool
• Data Transformer Manager
Luis Marenco, Rixin Wang, Perry Miller, Gordon Shepherd
Yale University
NIF was designed to be populated rapidly with progressive refinement
Databases come in many shapes and sizes
• Primary data:
– Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL)
• Secondary data
– Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD), gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS)
• Tertiary data
– Claims and assertions about the meaning of data
– E.g., gene upregulation/downregulation, brain activation as a function of task
• Registries:
– Metadata
– Pointers to data sets or materials stored elsewhere
• Data aggregators
– Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede
• Single source
– Data acquired within a single context, e.g., Allen Brain Atlas
Researchers are producing a variety of information artifacts using a multitude of technologies
Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”
Query expansion: Synonyms and related concepts
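A minimal sketch of synonym-based query expansion, assuming a small in-memory synonym table. The `SYNONYMS` table and the `expand_query` and `quote` helpers are hypothetical names used for illustration only; they are not NIF's implementation, and only the hippocampus row comes from the slide.

```python
# Toy synonym table. Only the hippocampus row comes from the slide;
# a real system would draw these from an ontology such as NIFSTD.
SYNONYMS = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(term, synonyms=SYNONYMS):
    """Return the term OR-ed with every known synonym; unknown terms pass through."""
    variants = [term] + synonyms.get(term.lower(), [])
    return " OR ".join(quote(v) for v in variants)


print(expand_query("Hippocampus"))
# Hippocampus OR "Cornu Ammonis" OR "Ammon's horn"
```

The expanded string is an ordinary Boolean query, so it can be handed to any search backend that understands OR and quoted phrases.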
  
Boolean queries
Data sources categorized by “data type” and level of nervous system
Common views across multiple sources
Tutorials for using full resource when getting there from NIF
Link back to record in original source
Connects to
Synapsed with
Synapsed by
Input region
Innervates
Axon innervates
Projects to
Cellular contact
Subcellular contact
Source site
Target site
Each resource implements a different, though related, model; in many cases the systems are complex and difficult to learn
• You (and the machine) have to be able to find it
– Accessible through the web
– Structured or semi-structured
– Annotations
• You (and the machine) have to be able to use it
– Data type specified and in an actionable form
• You (and the machine) have to know what the data mean
– Semantics
– Context: Experimental metadata
– Provenance: where did they come from
Knowledge in space and spatial relationships (the “where”)
Knowledge in words, terminologies and logical relationships (the “what”)
Purkinje Cell
Axon Terminal
Axon
Dendritic Tree
Dendritic Spine
Dendrite
Cell body
Cerebellar cortex
There is little obvious connection between data sets taken at different scales using different microscopies without an explicit representation of the biological objects that the data represent
• NIF covers multiple structural scales and domains of relevance to neuroscience
• Aggregate of community ontologies with some extensions for neuroscience, e.g., Gene Ontology, ChEBI, Protein Ontology
NIFSTD
Organism; NS Function; Molecule; Investigation; Subcellular structure; Macromolecule; Gene; Molecule Descriptors; Techniques; Reagent; Protocols; Cell; Resource; Instrument; Dysfunction; Quality; Anatomical Structure
Brain has a Cerebellum; Cerebellum has a Purkinje Cell Layer; Purkinje Cell Layer has a Purkinje cell; Purkinje cell is a neuron
• Ontology: an explicit, formal representation of concepts and the relationships among them within a particular domain that expresses human knowledge in a machine readable form
• Branch of philosophy: a theory of what is
• e.g., Gene ontologies
• Express neuroscience concepts in a way that is machine readable
– Synonyms, lexical variants
– Definitions
• Provide means of disambiguation of strings
– Nucleus part of cell; nucleus part of brain; nucleus part of atom
• Rules by which a class is defined, e.g., a GABAergic neuron is a neuron that releases GABA as a neurotransmitter
• Properties
– Support reasoning
• Provide universals for navigating across different data sources
– Semantic “index”
– Link data through relationships, not just one-to-one mappings
• Provide the basis for concept-based queries to probe and mine data
• Establish a semantic framework for landscape analysis
Mathematics, Computer code or Esperanto
birnlex_1732 Brodmann.1
Explicit mapping of database content helps disambiguate non-unique and custom terminology
Aligns sources to the NIF semantic framework
• Search Google: GABAergic neuron
• Search NIF: GABAergic neuron
– NIF automatically searches for types of GABAergic neurons
Types of GABAergic neurons
Search by meaning not by string
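A minimal sketch of searching by meaning, assuming a class is defined by a rule over properties instead of by a hand-kept list. The `NEURONS` records, the `DEFINED_CLASSES` table and the `expand_concept` helper are hypothetical and kept deliberately small; they illustrate the idea and are not NIF's reasoner or NeuroLex content.

```python
# Toy neuron records: name -> properties. Illustrative only.
NEURONS = {
    "Purkinje cell": {"neurotransmitter": "GABA"},
    "Cerebellar basket cell": {"neurotransmitter": "GABA"},
    "Cerebellar granule cell": {"neurotransmitter": "glutamate"},
}

# A defined class is a rule: "a GABAergic neuron is a neuron that releases GABA".
DEFINED_CLASSES = {
    "GABAergic neuron": lambda props: props.get("neurotransmitter") == "GABA",
}


def expand_concept(concept):
    """Return the concept plus every neuron type that the class rule admits."""
    rule = DEFINED_CLASSES.get(concept)
    if rule is None:
        return [concept]  # no definition known: fall back to plain string search
    members = sorted(name for name, props in NEURONS.items() if rule(props))
    return [concept] + members


print(expand_concept("GABAergic neuron"))
# ['GABAergic neuron', 'Cerebellar basket cell', 'Purkinje cell']
```

A string search for "GABAergic neuron" would miss records that only mention "Purkinje cell"; expanding through the class definition picks them up.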
Equivalence classes; restrictions
Arbitrary but defensible
• Neurons classified by
– Circuit role: principal neuron vs interneuron
– Molecular constituent: Parvalbumin-neurons, calbindin-neurons
– Brain region: Cerebellar neuron
– Morphology: Spiny neuron
• Molecule Roles: Drug of abuse, anterograde tracer, retrograde tracer
• Brain parts: Circumventricular organ
• Organisms: Non-human primate, non-human vertebrate
• Qualities: Expression level
• Techniques: Neuroimaging
What genes are upregulated by drugs of abuse in the adult mouse? (show me the data!)
Morphine
Increased expression
Adult Mouse
• NIF Connectivity: 7 databases containing connectivity primary data or claims from literature on connectivity between brain regions
– Brain Architecture Management System (rodent)
– Temporal lobe.com (rodent)
– Connectome Wiki (human)
– Brain Maps (various)
– CoCoMac (primate cortex)
– UCLA Multimodal database (Human fMRI)
– Avian Brain Connectivity Database (Bird)
• Total: 1800 unique brain terms (excluding Avian)
• Number of exact terms used in > 1 database: 42
• Number of synonym matches: 99
• Number of 1st order partonomy matches: 385
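The three kinds of match counted above can be sketched as follows. This is an illustration under stated assumptions, not the analysis that produced the numbers: the `SYNONYM_OF` and `PART_OF` tables and the `match_type` function are hypothetical, and treating exact matching as case-insensitive is an assumption of this sketch.

```python
# Toy lookup tables; a real comparison would read them from an ontology.
SYNONYM_OF = {"ammon's horn": "hippocampus", "cornu ammonis": "hippocampus"}
PART_OF = {"ca1": "hippocampus"}  # child term -> direct parent (first-order partonomy)


def canonical(term):
    """Lower-case a term and replace a known synonym with its preferred label."""
    t = term.strip().lower()
    return SYNONYM_OF.get(t, t)


def match_type(term_a, term_b):
    """Classify how two database terms relate: exact, synonym, partonomy or none."""
    a, b = term_a.strip().lower(), term_b.strip().lower()
    if a == b:
        return "exact"
    ca, cb = canonical(a), canonical(b)
    if ca == cb:
        return "synonym"
    if PART_OF.get(ca) == cb or PART_OF.get(cb) == ca:
        return "partonomy"
    return "none"


print(match_type("Ammon's horn", "Hippocampus"))  # synonym
print(match_type("CA1", "Cornu Ammonis"))         # partonomy
```

Each step widens the net: synonym matching is tried only after exact matching fails, and partonomy matching only after both fail.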
  
http://neurolex.org
• Semantic MediaWiki
• Provide a simple interface for defining the concepts required
• Lightweight semantics
• Good teaching tool for learning about semantic integration and the benefits of a consistent semantic framework
• Community based:
– Anyone can contribute their terms, concepts, things
– Anyone can edit
– Anyone can link
• Accessible: searched by Google
• Growing into a significant knowledge base for neuroscience
• International Neuroinformatics Coordinating Facility
Demo D03
Larson et al, Frontiers in Neuroinformatics, in press
• Neurolex provides an on-line computable index for expressing models in semantic terms, and linking to other knowledge and data
• Implemented forms for certain types of entities
• Neuroscience knowledge in the web
Pages are linked through properties; Knowledge-base built through cross-modular relations and links to data; red links
• > 1000 Dicom Terms
– Karl Helmer
– Data Sharing Task Force
• Tasks and Cognitive Concepts from Cognitive Atlas
– Russ Poldrack
• >280 Neurons
– Gordon Shepherd and 30 worldwide experts
• ~500 fly neurons from Fly Anatomy Ontology
– David Osumi-Sutherland
• >1200 Brain parcellations
~20,000 concepts: Spreadsheet downloads, through NIF Web Services, SPARQL endpoint
• 200,000 edits
• 150 contributors
Because they have static URLs, wikis are searchable by Google
Neurolex: > 1 million triples
Dr. Yi Zeng: Chinese neural knowledge base
NIF Cell Graph
1. Look the brain region up in NeuroLex
2. Look up cells contained in the brain region
3. Find those cells that are known to project out of that brain region
4. Look up the neurotransmitters for those cells
5. Determine whether those neurotransmitters are known to be excitatory or inhibitory
6. Report the projection as excitatory or inhibitory, and report the entire chain of logic with links back to the wiki pages where the statements were made
7. Make sure the user can get back to each statement in the logic chain to edit it if they think it is wrong
Stephen Larson
CHEBI:18243
Are projections from the VTA excitatory or inhibitory?
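Steps 1 to 6 of the chain above can be sketched against a toy knowledge base. Everything here is an assumption made for illustration: the four lookup tables stand in for wiki pages, `classify_projections` is a hypothetical name, and the cerebellar example is used only because its facts are simple. The sketch is not the NIF Cell Graph code, and step 7 (editing a statement) is not modeled.

```python
# Toy knowledge base standing in for wiki pages. Illustrative only.
CELLS_IN_REGION = {"Cerebellar cortex": ["Purkinje cell", "Cerebellar granule cell"]}
PROJECTS_OUT_OF_REGION = {"Purkinje cell": True, "Cerebellar granule cell": False}
NEUROTRANSMITTER = {"Purkinje cell": "GABA", "Cerebellar granule cell": "glutamate"}
TRANSMITTER_EFFECT = {"GABA": "inhibitory", "glutamate": "excitatory"}


def classify_projections(region):
    """Return one report per projecting cell type, with the chain of statements used."""
    reports = []
    for cell in CELLS_IN_REGION.get(region, []):                 # steps 1-2
        if not PROJECTS_OUT_OF_REGION.get(cell, False):          # step 3
            continue
        transmitter = NEUROTRANSMITTER.get(cell)                 # step 4
        effect = TRANSMITTER_EFFECT.get(transmitter, "unknown")  # step 5
        chain = [                                                # step 6: keep the evidence
            f"{region} contains {cell}",
            f"{cell} projects out of {region}",
            f"{cell} releases {transmitter}",
            f"{transmitter} is {effect}",
        ]
        reports.append({"cell": cell, "effect": effect, "chain": chain})
    return reports


for report in classify_projections("Cerebellar cortex"):
    print(report["cell"], "->", report["effect"])
    for statement in report["chain"]:
        print("  because:", statement)
```

Returning the chain alongside the answer is the point of the design: a reader who disagrees with the conclusion can see which single statement to go back and correct.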
  
• INCF Project
– Neuron Registry
– > 30 experts worldwide
– Fill out neuron pages in Neurolex Wiki
– Led by Dr. Gordon Shepherd
[Chart: Number (axis 0 to 300) of total, redlinks, easy fixes and hard fixes for soma location, dendrite location and axon location]
Social networks and community sites let us learn things from the collective behavior of contributors
neurolex.org: Semantic Wiki
• INCF Community encyclopedia
• Define all vocabulary, terms, protocols, brain structures, diseases, etc
• Living review articles
• Links to data, models and literature
• Semantic organization, search, analysis and integration
• Searchable via the web
• Global directory of all shared vocabularies, CDEs, etc
Slide courtesy of Sean Hill: International Neuroinformatics Coordinating Facility
Martin Telefont, HBP: Lab Space connecting to Knowledge Space
• NIF can be used to survey the data landscape
• Analysis of NIF shows multiple databases with similar scope and content
• Many contain partially overlapping data
• Data “flows” from one resource to the next
– Data is reinterpreted, reanalyzed or added to
• Is duplication good or bad?
NIF is trying to make it easier to work with diverse data
NIF is in a unique position to answer questions about the neuroscience landscape: Kepler Workflow engine + NIF semantics
Where are the data?
[Figure labels: Striatum; Hypothalamus; Olfactory bulb; Cerebral cortex; Brain. Axes: Brain region; Data source]
∞
What is easily machine processable and accessible
What is potentially knowable
What is known: Literature, images, human knowledge
Unstructured; natural language processing, entity recognition, image processing and analysis; paywalls
Communication
Abstracts vs full text vs tables etc
Closed world vs open world
We know a lot about some things and less about others; some of NIF’s sources are comprehensive; others are highly biased
But...NIF has > 2M antibodies, 338,000 model organisms, and 3 million microarray records
Neocortex
Olfactory bulb
Neostriatum
Cochlear nucleus
All neurons with cell bodies in the same brain region are grouped together
Properties in Neurolex
Exposing knowledge gaps and biases
Where are the data?
[Figure labels: Striatum; Hypothalamus; Olfactory bulb; Cerebral cortex; Brain. Axes: Brain region; Data source; Funding]
• Gemma: Gene ID + Gene Symbol
• DRG: Gene name + Probe ID
• Gemma presented results relative to baseline chronic morphine; DRG with respect to saline, so direction of change is opposite in the 2 databases
• Analysis:
– 1370 statements from Gemma regarding gene expression as a function of chronic morphine
– 617 were consistent with DRG; over half of the claims of the paper were not confirmed in this analysis
– Results for 1 gene were opposite in DRG and Gemma
– 45 did not have enough information provided in the paper to make a judgment
Relatively simple standards would make life easier
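The baseline problem above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis behind the counts: the record layout, the `normalize` and `compare` helpers and the gene names `GeneA` and `GeneB` are hypothetical, and the sketch assumes a two-condition design in which switching the reference condition simply flips the direction of change.

```python
FLIP = {"up": "down", "down": "up"}


def normalize(direction, baseline, target_baseline="saline"):
    """Express a change relative to target_baseline, flipping it if the source used the other condition."""
    return direction if baseline == target_baseline else FLIP[direction]


def compare(gemma, drg):
    """Map each gene in gemma to 'consistent', 'opposite' or 'missing' with respect to drg.

    Both inputs map a gene name to a (direction, baseline) pair.
    """
    result = {}
    for gene, (direction, baseline) in gemma.items():
        if gene not in drg:
            result[gene] = "missing"
            continue
        other_direction, other_baseline = drg[gene]
        same = normalize(direction, baseline) == normalize(other_direction, other_baseline)
        result[gene] = "consistent" if same else "opposite"
    return result


# Made-up records: Gemma reports relative to chronic morphine, DRG relative to saline.
gemma = {"GeneA": ("down", "chronic morphine"), "GeneB": ("up", "chronic morphine")}
drg = {"GeneA": ("up", "saline"), "GeneB": ("up", "saline")}
print(compare(gemma, drg))
# {'GeneA': 'consistent', 'GeneB': 'opposite'}
```

If every source recorded its reference condition in a standard field, this normalization could run automatically instead of being reconstructed by hand from each paper.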
  
NIF favors a hybrid, tiered, federated system
• Domain knowledge
– Ontologies
• Claims, models and observations
– Virtuoso RDF triples
– Model repositories
• Data
– Data federation
– Spatial data
– Workflows
• Narrative
– Full text access
Neuron; Brain part; Disease; Organism; Gene; Technique
Caudate projects to SNpc
Grm1 is upregulated in chronic cocaine
Betz cells degenerate in ALS
NIF provides the tentacles that connect the pieces: a new type of entity for 21st century science
FORCE11.org: Future of research communications and e-scholarship
[Diagram labels: People; Scholar; Library; Publisher; Consumer; Libraries; Data Repositories; Code Repositories; Community databases/platforms; OA; Curators; Social Networks; Peer Reviewers; Narrative; Workflows; Data; Models; Multimedia; Nanopublications; Code]
• Of the ~4000 columns that NIF queries, ~1300 map to one of our core categories:
– Organism
– Anatomical structure
– Cell
– Molecule
– Function
– Dysfunction
– Technique
• 30-50% of NIF’s queries autocomplete
• When NIF combines multiple sources, a set of common fields emerges
– Basic information models/semantic models exist for certain types of entities
Semantic frameworks create spaces in which to compare the current state of data and knowledge
• Several powerful trends should change the way we think about our data: One → Many
– Many data
• Generation of data is getting easier → shared data
• Data space is getting richer: more -omes every day
• But...compared to the biological space, still sparse
– Many resources: everyone wants to be “the” one but e pluribus unum
– Many eyes
• Wisdom of crowds
• More than one way to interpret data
– Many algorithms
• Not a single way to analyze data
– Many analytics
• “Signatures” in data may not be directly related to the question for which they were acquired but tell us something really interesting
New works need to be created with an eye towards the web and interoperability
Jeff Grethe, UCSD, Co Investigator, Interim PI
Amarnath Gupta, UCSD, Co Investigator
Anita Bandrowski, NIF Project Leader
Gordon Shepherd, Yale University
Perry Miller
Luis Marenco
Rixin Wang
David Van Essen, Washington University
Erin Reid
Paul Sternberg, Cal Tech
Arun Rangarajan
Hans Michael Muller
Yuling Li
Giorgio Ascoli, George Mason University
Sridevi Polavarum
Fahim Imam
Larry Lui
Andrea Arnaud Stagg
Jonathan Cachat
Jennifer Lawrence
Svetlana Sulima
Davis Banks
Vadim Astakhov
Xufei Qian
Chris Condit
Mark Ellisman
Stephen Larson
Willie Wong
Tim Clark, Harvard University
Paolo Ciccarese
Karen Skinner, NIH, Program Officer (retired)
Jonathan Pollock, NIH, Program Officer
And my colleagues in Monarch, dkNet, 3DVC, Force 11
Data Space
Laboratory Space
Knowledge Space
BAMS
Lexicon
Encyclopedia
47/53 major preclinical published cancer studies could not be replicated
• “The scientific community assumes that the claims in a preclinical study can be taken at face value - that although there might be some errors in detail, the main message of the paper can be relied on and the data will, for the most part, stand the test of time. Unfortunately, this is not always the case.”
• Getting data out sooner in a form where they can be exposed to many eyes and many analyses may allow us to expose errors and develop better metrics to evaluate the validity of data
Begley and Ellis, 29 MARCH 2012 | VOL 483 | NATURE | 531
• Every resource is resource limited: few have enough time, money, staff or expertise required to do everything they would like
– If the market can support 11 MRI databases, fine
– Some consolidation, coordination is usually warranted
• Big, broad and messy beats small, narrow and neat
– Without trying to integrate a lot of data, we will not know what needs to be done
– Progressive refinement; addition of complexity through layers
• Be flexible and opportunistic
– A single optimal technology/container for all types of scientific data and information does not exist; technology is changing
• Think globally; act locally:
– No source, not even NIF, is THE source; we are all a source
– Think about interoperation from the inception
[Diagram labels: Regional part of nervous system; Functional part of nervous system; Parcellation scheme; Parcellation scheme parcel; Single species or strain; Precise definition; Technique; Partially overlaps; Taxon rank; General hierarchy]
INCF Task Force: Alan Ruttenberg, Seth Ruffins
• 1200 parts of nervous system characterized (mostly) according to CUMBO terms
• 1200 “parcels” from individual atlases/papers
• 700 neurons
– 280 via Neuron Registry
• Available via NIF vocabulary services (REST)
• Hosted in a Virtuoso triple store via SPARQL

Technology R&D Theme 3: Multi-scale Network RepresentationsTechnology R&D Theme 3: Multi-scale Network Representations
Technology R&D Theme 3: Multi-scale Network Representations
 
ICDMWorkshopProposal.doc
ICDMWorkshopProposal.docICDMWorkshopProposal.doc
ICDMWorkshopProposal.doc
 
A Dual congress Psychiatry and the Neurosciences
A Dual congress Psychiatry and the NeurosciencesA Dual congress Psychiatry and the Neurosciences
A Dual congress Psychiatry and the Neurosciences
 

Neuroscience Data Needs Multiple Databases and Formats

  • 1. Maryann E. Martone, Ph.D. University of California, San Diego
  • 2. Neuroscience is unlikely to be served by a few large databases like the genomics and proteomics community. Whole brain data (20 µm microscopic MRI); Mosaic LM images (1 GB+); Conventional LM images; Individual cell morphologies; EM volumes & reconstructions; Solved molecular structures. No single technology serves these all equally well. Multiple data types; multiple scales; multiple databases
  • 4.
  • 5. • NIF’s mission is to maximize the awareness of, access to and utility of research resources produced worldwide to enable better science and promote efficient use – NIF unites neuroscience information without respect to domain, funding agency, institute or community – NIF is like a “Pub Med” for all biomedical resources and a “Pub Med Central” for databases – Makes them searchable from a single interface – Practical and cost-effective; tries to be sensible – Learned a lot about current data practices. The Neuroscience Information Framework is an initiative of the NIH Blueprint consortium of institutes http://neuinfo.org
  • 6. http://neuinfo.org June 10, 2013 dkCOIN Investigator's Retreat • A portal for finding and using neuroscience resources • A consistent framework for describing resources • Provides simultaneous search of multiple types of information, organized by category • Supported by an expansive ontology for neuroscience • Utilizes advanced technologies to search the “hidden web”. UCSD, Yale, Cal Tech, George Mason, Washington Univ. Literature; Database Federation; Registry
  • 7. We’d like to be able to find: • What is known: – What are the projections of hippocampus? – Is GRM1 expressed in cerebral cortex? – What genes have been found to be upregulated in chronic drug abuse in adults? – What animal models have similar phenotypes to Parkinson’s disease? – What studies used my polyclonal antibody against GABA in humans? • What is not known: – Connections among data – Gaps in knowledge. A framework makes it easier to address these questions
  • 8.
  • 9. With the thousands of databases and other information sources available, simple descriptive metadata will not suffice
  • 10. • NIF curators • Nomination by the community • Semi-automated text mining pipelines • NIF Registry • Requires no special skills • Site map available for local hosting • NIF Data Federation • DISCO interop • Requires some programming skill • Open Source Brain < 2 hr. Two-tiered system: low barrier to entry
  • 11. Current; Planned. DISCO Dashboard Functions • Ingest Script Manager • Public Script Repository • Data & Event Tracker • Versioning System • Curator Tool • Data Transformer Manager. Luis Marenco, Rixin Wang, Perry Miller, Gordon Shepherd, Yale University
  • 12. NIF was designed to be populated rapidly with progressive refinement
  • 13. Databases come in many shapes and sizes • Primary data: – Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL) • Secondary data – Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD), gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS) • Tertiary data – Claims and assertions about the meaning of data • E.g., gene upregulation/downregulation, brain activation as a function of task • Registries: – Metadata – Pointers to data sets or materials stored elsewhere • Data aggregators – Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede • Single source – Data acquired within a single context, e.g., Allen Brain Atlas. Researchers are producing a variety of information artifacts using a multitude of technologies
  • 14. Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”. Query expansion: Synonyms and related concepts; Boolean queries; Data sources categorized by “data type” and level of nervous system; Common views across multiple sources; Tutorials for using full resource when getting there from NIF; Link back to record in original source
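The query expansion named on slide 14 can be sketched in a few lines. This is only an illustration, not NIF's implementation: the `SYNONYMS` table and the `expand_query` helper are hypothetical names, and the only synonyms included are the ones the slide itself lists for hippocampus. A real system would presumably draw synonyms from an ontology rather than a hard-coded dictionary.

```python
# Hypothetical synonym table; the entries are the ones shown on the slide.
SYNONYMS = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(term, synonyms=SYNONYMS):
    """Expand a search term into a Boolean OR query over its known synonyms."""
    related = synonyms.get(term.lower(), [])
    return " OR ".join(quote(t) for t in [term, *related])


print(expand_query("Hippocampus"))
```

A term with no entry in the table is returned unchanged, so the expansion step can be applied to every query without special cases.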
  • 15. Connects  to   Synapsed  with   Synapsed  by   Input  region   innervates   Axon  innervates   Projects  to  Cellular  contact   Subcellular  contact   Source  site   Target    site   Each  resource  implements  a  different,  though  related  model;     systems  are  complex  and  difficult  to  learn,  in  many  cases  
  • 16.
  • 17. • You (and the machine) have to be able to find it – Accessible through the web – Structured or semi-structured – Annotations • You (and the machine) have to be able to use it – Data type specified and in an actionable form • You (and the machine) have to know what the data mean • Semantics • Context: Experimental metadata • Provenance: where did they come from
  • 18. Knowledge in space and spatial relationships (the “where”) Knowledge in words, terminologies and logical relationships (the “what”)
  • 19. Purkinje Cell Axon Terminal Axon Dendritic Tree Dendritic Spine Dendrite Cell body Cerebellar cortex There is little obvious connection between data sets taken at different scales using different microscopies without an explicit representation of the biological objects that the data represent
  • 20. • NIF covers multiple structural scales and domains of relevance to neuroscience • Aggregate of community ontologies with some extensions for neuroscience, e.g., Gene Ontology, ChEBI, Protein Ontology NIFSTD Organism NS Function Molecule Investigation Subcellular structure Macromolecule Gene Molecule Descriptors Techniques Reagent Protocols Cell Resource Instrument Dysfunction Quality Anatomical Structure
  • 21. Brain Cerebellum Purkinje Cell Layer Purkinje cell neuron has a has a has a is a • Ontology: an explicit, formal representation of concepts and the relationships among them within a particular domain that expresses human knowledge in a machine readable form • Branch of philosophy: a theory of what is • e.g., Gene ontologies
  • 22. • Express neuroscience concepts in a way that is machine readable – Synonyms, lexical variants – Definitions • Provide means of disambiguation of strings – Nucleus part of cell; nucleus part of brain; nucleus part of atom • Rules by which a class is defined, e.g., a GABAergic neuron is a neuron that releases GABA as a neurotransmitter • Properties – Support reasoning • Provide universals for navigating across different data sources – Semantic “index” – Link data through relationships not just one-to-one mappings • Provide the basis for concept-based queries to probe and mine data • Establish a semantic framework for landscape analysis Mathematics, Computer code or Esperanto
  • 23. birnlex_1732 Brodmann.1 Explicit mapping of database content helps disambiguate non-unique and custom terminology
  • 24. Aligns sources to the NIF semantic framework
  • 25.
  • 26. • Search Google: GABAergic neuron • Search NIF: GABAergic neuron – NIF automatically searches for types of GABAergic neurons Types of GABAergic neurons Search by meaning not by string
  • 27. Equivalence classes; restrictions Arbitrary but defensible • Neurons classified by • Circuit role: principal neuron vs interneuron • Molecular constituent: Parvalbumin-neurons, calbindin-neurons • Brain region: Cerebellar neuron • Morphology: Spiny neuron • Molecule Roles: Drug of abuse, anterograde tracer, retrograde tracer • Brain parts: Circumventricular organ • Organisms: Non-human primate, non-human vertebrate • Qualities: Expression level • Techniques: Neuroimaging
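Searching "by meaning" amounts to expanding a concept to all of its subtypes before matching records. The sketch below assumes a toy is-a hierarchy; the subtype names, the `IS_A` table and both helper functions are placeholders invented for illustration, not NIFSTD classes or NIF code.

```python
# Toy is-a hierarchy (child -> parent); subtype names are placeholders.
IS_A = {
    "GABAergic neuron": "neuron",
    "GABAergic subtype A": "GABAergic neuron",
    "GABAergic subtype B": "GABAergic neuron",
    "GABAergic subtype A1": "GABAergic subtype A",
    "other neuron": "neuron",
}


def descendants(concept, is_a=IS_A):
    """Return every direct or indirect subtype of a concept, sorted by name."""
    found = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parent in is_a.items():
            if parent == current and child not in found:
                found.append(child)
                frontier.append(child)
    return sorted(found)


def search_by_meaning(concept, records):
    """Keep records annotated with the concept itself or any of its subtypes."""
    targets = {concept, *descendants(concept)}
    return [r for r in records if r["cell_type"] in targets]


records = [
    {"id": 1, "cell_type": "GABAergic subtype A1"},
    {"id": 2, "cell_type": "other neuron"},
]
print(search_by_meaning("GABAergic neuron", records))
```

A plain string search for "GABAergic neuron" would miss record 1 if its annotation used a more specific label; the hierarchy walk recovers it.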
  • 28. What  genes  are  upregulated  by  drugs  of  abuse  in  the   adult  mouse?  (show  me  the  data!)   Morphine   Increased   expression   Adult  Mouse  
  • 29. • NIF Connectivity: 7 databases containing connectivity primary data or claims from literature on connectivity between brain regions • Brain Architecture Management System (rodent) • Temporal lobe.com (rodent) • Connectome Wiki (human) • Brain Maps (various) • CoCoMac (primate cortex) • UCLA Multimodal database (Human fMRI) • Avian Brain Connectivity Database (Bird) • Total: 1800 unique brain terms (excluding Avian) • Number of exact terms used in > 1 database: 42 • Number of synonym matches: 99 • Number of 1st order partonomy matches: 385
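The three kinds of match counted on this slide (exact, synonym, first-order partonomy) can be told apart with a small classifier. This is a sketch under stated assumptions: the `SYNONYMS` and `PART_OF` tables are tiny illustrative stand-ins for an ontology, and `match_type` is a hypothetical helper, not the procedure NIF used to obtain its counts.

```python
# Toy lookup tables standing in for an ontology.
SYNONYMS = {"hippocampus": {"ammon's horn"}}  # preferred term -> synonyms
PART_OF = {"ca1": "hippocampus"}              # part -> whole (first order only)


def canonical(term):
    """Map a term to its preferred label, ignoring case and outer whitespace."""
    t = term.strip().lower()
    for preferred, syns in SYNONYMS.items():
        if t == preferred or t in syns:
            return preferred
    return t


def match_type(a, b):
    """Classify how two brain-region terms from different databases relate."""
    if a.strip().lower() == b.strip().lower():
        return "exact"
    ca, cb = canonical(a), canonical(b)
    if ca == cb:
        return "synonym"
    if PART_OF.get(ca) == cb or PART_OF.get(cb) == ca:
        return "partonomy"
    return "none"


for pair in [("Hippocampus", "hippocampus"),
             ("Ammon's horn", "Hippocampus"),
             ("CA1", "Ammon's horn"),
             ("CA1", "Striatum")]:
    print(pair, match_type(*pair))
```

Checking matches in this order (exact, then synonym, then partonomy) keeps each pair in the strictest category it satisfies.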
  • 30. http://neurolex.org • Semantic MediaWiki • Provide a simple interface for defining the concepts required • Lightweight semantics • Good teaching tool for learning about semantic integration and the benefits of a consistent semantic framework • Community based: • Anyone can contribute their terms, concepts, things • Anyone can edit • Anyone can link • Accessible: searched by Google • Growing into a significant knowledge base for neuroscience • International Neuroinformatics Coordinating Facility Demo D03 Larson et al, Frontiers in Neuroinformatics, in press
  • 31. • Neurolex provides an on-line computable index for expressing models in semantic terms, and linking to other knowledge and data • Implemented forms for certain types of entities • Neuroscience knowledge in the web Pages are linked through properties; Knowledge-base built through cross-modular relations and links to data; red links
  • 32. • > 1000 DICOM Terms – Karl Helmer – Data Sharing Task Force • Tasks and Cognitive Concepts from Cognitive Atlas – Russ Poldrack • >280 Neurons – Gordon Shepherd and 30 worldwide experts • ~500 fly neurons from Fly Anatomy Ontology – David Osumi-Sutherland • >1200 Brain parcellations ~20,000 concepts: Spreadsheet downloads, through NIF Web Services, SPARQL endpoint 200,000 edits 150 contributors
  • 33. Because they have static URLs, wiki pages are searchable by Google
  • 34. Neurolex:    >  1  million  triples Dr.  Yi  Zeng:    Chinese  neural  knowledge  base   NIF  Cell  Graph  
  • 35. 1. Look the brain region up in NeuroLex 2. Look up cells contained in the brain region 3. Find those cells that are known to project out of that brain region 4. Look up the neurotransmitters for those cells 5. Determine whether those neurotransmitters are known to be excitatory or inhibitory 6. Report the projection as excitatory or inhibitory, and report the entire chain of logic with links back to the wiki pages where the statements were made 7. Make sure the user can get back to each statement in the logic chain to edit it if they think it is wrong Stephen Larson CHEBI:18243 Are projections from the VTA excitatory or inhibitory?
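The seven steps above can be sketched as a chain of lookups over a small knowledge base. Everything in `KB` below is a made-up placeholder ("Region X", "Cell A", "Transmitter T"), not a NeuroLex fact, and `classify_projections` is a hypothetical helper; the real workflow queried the wiki, whereas this sketch only shows how the chain of logic can be carried along with the answer.

```python
# Toy knowledge base; every entry is a made-up placeholder, not a NeuroLex fact.
KB = {
    "cells_in_region": {"Region X": ["Cell A", "Cell B"]},
    "projects_out": {"Cell A": True, "Cell B": False},
    "transmitter": {"Cell A": "Transmitter T"},
    "effect": {"Transmitter T": "inhibitory"},
}


def classify_projections(region, kb=KB):
    """Follow region -> cells -> projecting cells -> transmitter -> effect."""
    results = []
    for cell in kb["cells_in_region"].get(region, []):      # steps 1-2
        if not kb["projects_out"].get(cell, False):          # step 3
            continue
        transmitter = kb["transmitter"].get(cell)            # step 4
        effect = kb["effect"].get(transmitter, "unknown")    # step 5
        chain = [                                            # step 6
            f"{region} contains {cell}",
            f"{cell} projects out of {region}",
            f"{cell} releases {transmitter}",
            f"{transmitter} is {effect}",
        ]
        results.append({"cell": cell, "effect": effect, "chain": chain})
    return results


for result in classify_projections("Region X"):
    print(result["cell"], "->", result["effect"])
    for statement in result["chain"]:
        print("   ", statement)
```

Step 7 is only hinted at here: in a wiki-backed version each statement in `chain` would carry the URL of the page it came from, so a user who disagrees can follow the link and edit it.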
  • 36. • INCF Project – Neuron Registry – > 30 experts worldwide – Fill out neuron pages in Neurolex Wiki – Led by Dr. Gordon Shepherd [Bar chart: number of total, redlinks, easy fixes and hard fixes for soma location, dendrite location and axon location] Social networks and community sites let us learn things from the collective behavior of contributors
  • 37. neurolex.org: Semantic Wiki • INCF Community encyclopedia • Define all vocabulary, terms, protocols, brain structures, diseases, etc • Living review articles • Links to data, models and literature • Semantic organization, search, analysis and integration • Searchable via the web • Global directory of all shared vocabularies, CDEs, etc Slide courtesy of Sean Hill: International Neuroinformatics Coordinating Facility
  • 38. Martin Telefont, HBP: Lab Space connecting to Knowledge Space
  • 39. • NIF can be used to survey the data landscape • Analysis of NIF shows multiple databases with similar scope and content • Many contain partially overlapping data • Data “flows” from one resource to the next – Data is reinterpreted, reanalyzed or added to • Is duplication good or bad? NIF is trying to make it easier to work with diverse data
  • 40. NIF is in a unique position to answer questions about the neuroscience landscape: Kepler Workflow engine + NIF semantics Where are the data? Striatum Hypothalamus Olfactory bulb Cerebral cortex Brain Brain region Data source
  • 41. ∞ What is easily machine processable and accessible What is potentially knowable What is known: Literature, images, human knowledge Unstructured; Natural language processing, entity recognition, image processing and analysis; paywalls communication Abstracts vs full text vs tables etc
  • 42. Closed world vs open world We know a lot about some things and less about others; some of NIF’s sources are comprehensive; others are highly biased But...NIF has > 2M antibodies, 338,000 model organisms, and 3 million microarray records
  • 43. Neocortex Olfactory bulb Neostriatum Cochlear nucleus All neurons with cell bodies in the same brain region are grouped together Properties in Neurolex
  • 44. Exposing  knowledge  gaps  and  biases   Where  are  the  data?   Striatum   Hypothalamus   Olfactory  bulb   Cerebral  cortex   Brain   Brain  region   Data  source   Funding  
  • 45. • Gemma: Gene ID + Gene Symbol • DRG: Gene name + Probe ID • Gemma presented results relative to baseline chronic morphine; DRG with respect to saline, so direction of change is opposite in the 2 databases • Analysis: • 1370 statements from Gemma regarding gene expression as a function of chronic morphine • 617 were consistent with DRG; over half of the claims of the paper were not confirmed in this analysis • Results for 1 gene were opposite in DRG and Gemma • 45 did not have enough information provided in the paper to make a judgment Relatively simple standards would make life easier
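The baseline problem on this slide (one source reports change relative to chronic morphine, the other relative to saline) is easy to handle once the baseline is recorded explicitly. The sketch below is illustrative only: the gene names and directions are invented, the `(direction, baseline)` record shape is an assumption, and `to_saline_baseline` and `compare` are hypothetical helpers rather than the analysis NIF ran.

```python
FLIP = {"up": "down", "down": "up"}


def to_saline_baseline(direction, baseline):
    """Re-express a direction of change as if saline were the baseline."""
    if baseline == "saline":
        return direction
    if baseline == "morphine":
        return FLIP[direction]
    raise ValueError(f"unknown baseline: {baseline}")


def compare(gemma, drg):
    """Compare two sources, each mapping gene -> (direction, baseline)."""
    report = {}
    for gene in sorted(set(gemma) & set(drg)):
        g = to_saline_baseline(*gemma[gene])
        d = to_saline_baseline(*drg[gene])
        report[gene] = "consistent" if g == d else "opposite"
    return report


# Invented example records; gene names and directions are placeholders.
gemma = {"GeneA": ("down", "morphine"), "GeneB": ("up", "morphine")}
drg = {"GeneA": ("up", "saline"), "GeneB": ("up", "saline"), "GeneC": ("down", "saline")}
print(compare(gemma, drg))
```

Without the normalization step, GeneA would look contradictory and GeneB would look confirmed, which is the reverse of what the data say. Recording the baseline next to each statement is one of the "relatively simple standards" that would remove this ambiguity.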
  • 46. NIF favors a hybrid, tiered, federated system • Domain knowledge – Ontologies • Claims, models and observations – Virtuoso RDF triples – Model repositories • Data – Data federation – Spatial data – Workflows • Narrative – Full text access Neuron Brain part Disease Organism Gene Caudate projects to Snpc Grm1 is upregulated in chronic cocaine Betz cells degenerate in ALS NIF provides the tentacles that connect the pieces: a new type of entity for 21st century science Technique People
  • 47. Scholar Library Scholar Publisher FORCE11.org: Future of research communications and e-scholarship
  • 48. Scholar Consumer Libraries Data Repositories Code Repositories Community databases/platforms OA Curators Social Networks Social Networks Social Networks Peer Reviewers Narrative Workflows Data Models Multimedia Nanopublications Code
  • 49. • Of the ~ 4000 columns that NIF queries, ~1300 map to one of our core categories: – Organism – Anatomical structure – Cell – Molecule – Function – Dysfunction – Technique • 30-50% of NIF’s queries autocomplete • When NIF combines multiple sources, a set of common fields emerges – >Basic information models/semantic models exist for certain types of entities Semantic frameworks create spaces in which to compare the current state of data and knowledge
  • 50. • Several powerful trends should change the way we think about our data: One → Many – Many data • Generation of data is getting easier → shared data • Data space is getting richer: more –omes every day • But...compared to the biological space, still sparse – Many resources: everyone wants to be “the” one but e pluribus unum – Many eyes • Wisdom of crowds • More than one way to interpret data – Many algorithms • Not a single way to analyze data – Many analytics • “Signatures” in data may not be directly related to the question for which they were acquired but tell us something really interesting New works need to be created with an eye towards the web and interoperability
  • 51. Jeff Grethe, UCSD, Co Investigator, Interim PI Amarnath Gupta, UCSD, Co Investigator Anita Bandrowski, NIF Project Leader Gordon Shepherd, Yale University Perry Miller Luis Marenco Rixin Wang David Van Essen, Washington University Erin Reid Paul Sternberg, Cal Tech Arun Rangarajan Hans Michael Muller Yuling Li Giorgio Ascoli, George Mason University Sridevi Polavarum Fahim Imam Larry Lui Andrea Arnaud Stagg Jonathan Cachat Jennifer Lawrence Svetlana Sulima Davis Banks Vadim Astakhov Xufei Qian Chris Condit Mark Ellisman Stephen Larson Willie Wong Tim Clark, Harvard University Paolo Ciccarese Karen Skinner, NIH, Program Officer (retired) Jonathan Pollock, NIH, Program Officer And my colleagues in Monarch, dkNet, 3DVC, Force 11
  • 52. Data  Space   Laboratory   Space   Knowledge   Space   BAMS   Lexicon   Encyclopedia  
  • 53. 47/50 major preclinical published cancer studies could not be replicated • “The scientific community assumes that the claims in a preclinical study can be taken at face value - that although there might be some errors in detail, the main message of the paper can be relied on and the data will, for the most part, stand the test of time. Unfortunately, this is not always the case.” • Getting data out sooner in a form where they can be exposed to many eyes and many analyses may allow us to expose errors and develop better metrics to evaluate the validity of data Begley and Ellis, 29 MARCH 2012 | VOL 483 | NATURE | 531
  • 54. • Every resource is resource limited: few have enough time, money, staff or expertise required to do everything they would like – If the market can support 11 MRI databases, fine – Some consolidation, coordination is usually warranted • Big, broad and messy beats small, narrow and neat – Without trying to integrate a lot of data, we will not know what needs to be done – Progressive refinement; addition of complexity through layers • Be flexible and opportunistic – A single optimal technology/container for all types of scientific data and information does not exist; technology is changing • Think globally; act locally: – No source, not even NIF, is THE source; we are all a source – Think about interoperation from the inception
  • 55.
  • 56. Regional part of nervous system Parcellation scheme parcel Parcellation scheme parcel Single species or strain Parcellation scheme Precise definition Technique INCF Task Force: Alan Rutenberg, Seth Ruffins Functional part of nervous system Partially overlaps Taxon rank General hierarchy
  • 57. • 1200 parts of nervous system characterized (mostly) according to CUMBO terms • 1200 “parcels” from individual atlases/papers • 700 neurons • 280 via Neuron Registry • Available via NIF vocabulary services (REST) • Hosted in a Virtuoso triple store via SPARQL