ArXiv.org
250,000 documents
47,000 registered users
1 million+ downloads per year

Cost Per Paper
Commercial Journal: $10,000
Non-Profit Journal: $1,000
arXiv: $10
Goal: Process an increasing number of submissions at
constant or declining cost
arXiv has an active core of users: 10% of users are
responsible for about 1/3 of all submissions, and 50% of all
users have logged in (to submit or update a paper) in the
past 1.5 years.
Authentication and Access Control
Recently moved from an HTTP authentication/Berkeley database system
to a system based on cookies and a relational database.
Currently, all registered users (who haven’t been suspended) can
submit to all subject classes in all archives – the original submitter or
somebody with the paper password can update the paper.
People are allowed to register depending on their e-mail address:
abc@university.edu can register, but xyz@company.com can’t unless
company=ibm,lucent,…; this list is hard to maintain (we have to block
popular ISPs in every country), exceptions are dealt with manually at
great cost (each case takes detective work), and there are many people
in .edu (alumni, non-research staff) who shouldn’t be able to submit.
Because registration and submission are linked, the user database can’t be
used to offer other services such as e-mail notification and personalization.
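
The address-based registration rule described above can be sketched roughly as follows. This is only an illustration, not arXiv's actual code: the function name `may_register` and the two sets are hypothetical, and the company domains are guessed from the "ibm, lucent" examples on the slide. The slide also mentions blocking popular ISPs, which this allowlist-only sketch leaves out.

```python
# Hypothetical sketch of address-based registration; not arXiv's real code.
ALLOWED_SUFFIXES = {".edu"}                    # academic domains are accepted
ALLOWED_COMPANIES = {"ibm.com", "lucent.com"}  # hand-maintained exceptions (assumed names)


def may_register(email: str) -> bool:
    """Return True if the address's domain passes the allowlist rule."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return False  # not a usable address
    if domain in ALLOWED_COMPANIES:
        return True
    return any(domain.endswith(suffix) for suffix in ALLOWED_SUFFIXES)
```

Note that any .edu address passes this check, including those of alumni and non-research staff, which is the weakness the slide points out.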
Endorsements and Trust Management
Administrators

Grandfathered Users

In the new system, everyone will be able to register. Users who
registered under the old system will still be able to upload to
any archive or subject class, but new users will need to be
endorsed by an author with a publication history in that
category. The burden shifts from one senior staff person to 47,000
registered users. User database can be used
[Diagram: Endorser, Endorsee, Endorsement code]
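
The endorsement rule on this slide could be modeled along the following lines. This is a sketch under stated assumptions, not the deployed system: the `User` fields, the function names, and the `MIN_PAPERS_TO_ENDORSE` threshold are all hypothetical (the slide says only that an endorser needs "a publication history in that category").

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    grandfathered: bool = False   # registered under the old system
    suspended: bool = False
    papers: dict = field(default_factory=dict)       # category -> paper count
    endorsed_for: set = field(default_factory=set)   # categories endorsed in


MIN_PAPERS_TO_ENDORSE = 1  # assumed threshold; the slide gives no number


def can_endorse(endorser: User, category: str) -> bool:
    """An endorser needs a publication history in the category."""
    return (not endorser.suspended
            and endorser.papers.get(category, 0) >= MIN_PAPERS_TO_ENDORSE)


def endorse(endorser: User, endorsee: User, category: str) -> bool:
    """Record an endorsement if the endorser qualifies; report success."""
    if not can_endorse(endorser, category):
        return False
    endorsee.endorsed_for.add(category)
    return True


def can_submit(user: User, category: str) -> bool:
    """Grandfathered users may submit anywhere; others need an endorsement."""
    if user.suspended:
        return False
    return user.grandfathered or category in user.endorsed_for
```

Registration itself carries no check here, which matches the slide's point that everyone can register while submission rights are granted separately.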
Web-based interface for administrators:
• View user history and publications
• Monitor endorsement process
• Manage authority records
• Disable ability to submit or endorse
• Keep “institutional memory”
Future Directions
• Flexible Submission Queue (Currently submissions are
published the following evening – we can’t easily delay a
submission)
• Validating Metadata Form (Force users to clean up entry
errors, so administrators don’t have to)
• Automatic Protection (Suspicious submissions and
endorsements will be automatically delayed)
• New Search Engine based on Lucene
• Retrofit e-mail notification (current awareness) to use new
user database
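
One way to picture the flexible queue and automatic delay items above is a queue in which every submission carries its own release time. The sketch below is purely illustrative: the class name `SubmissionQueue`, the one-day `hold`, and the `suspicious` flag supplied by the caller are assumptions, since the slides do not say how suspicious submissions would be detected or how long they would be held.

```python
import heapq
from datetime import datetime, timedelta


class SubmissionQueue:
    """Hypothetical queue: each paper carries its own release time."""

    def __init__(self, hold=timedelta(days=1)):
        self.hold = hold   # assumed extra delay for flagged papers
        self._heap = []    # entries: (release_time, sequence, paper_id)
        self._seq = 0      # tie-breaker that preserves submission order

    def submit(self, paper_id, next_announcement, suspicious=False):
        """Schedule a paper for the next announcement, or later if flagged."""
        extra = self.hold if suspicious else timedelta(0)
        heapq.heappush(self._heap, (next_announcement + extra, self._seq, paper_id))
        self._seq += 1

    def publish_due(self, now):
        """Pop and return every paper whose release time has arrived."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

Because the release time is stored per paper, delaying a single submission means pushing it with a later time rather than changing the nightly schedule.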
Classifying Articles with the
Support Vector Machine
Paul Ginsparg
Paul Houle
Thorsten Joachims
Jae-Hoon Sul
Goal: identify papers in existing archives that are relevant to
a new subject archive, q-bio (Quantitative Biology)
Active Training of SVM
[Figure legend: Training: q-bio; Training: not q-bio; Other far from margin; Other close to margin]

SVM finds the maximum-margin hyperplane. We do a first training run on one
year of data, then identify other papers that lie close to the dividing line.
We iteratively classify these by hand to refine the classification.
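
The step of picking out papers that lie close to the dividing line might look like the sketch below. It assumes a linear SVM has already been trained elsewhere and that its weight vector and bias are available; the function names and the dictionary of feature vectors are hypothetical, and the training itself is not shown.

```python
import math


def signed_distance(weights, bias, features):
    """Signed distance of a feature vector from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return score / norm


def closest_to_boundary(weights, bias, papers, k):
    """Return the ids of the k unlabeled papers nearest the hyperplane.

    papers maps a paper id to its feature vector (hypothetical layout).
    """
    ranked = sorted(
        papers,
        key=lambda pid: abs(signed_distance(weights, bias, papers[pid])),
    )
    return ranked[:k]
```

The papers returned would be the ones handed to a person for labeling before the next training run, since they are the cases the current classifier is least sure about.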
Classifier performance improves as the size of a category
increases.
Time Series Analysis of Content
and Usage Information
Paul Ginsparg
Jon Kleinberg
Kleinberg’s algorithm uses a hidden Markov model to detect bursts of
word usage in arXiv titles, revealing intellectual trends in the last
decade of high-energy physics theory.
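
A simplified two-state burst detector in the spirit of this approach is sketched below. It is not necessarily the variant used for the arXiv analysis: the function name, the rate multiplier `s`, the transition weight `gamma`, and the `gamma * ln(n)` cost of entering a burst are assumptions made for illustration. Each batch (for example, one year of titles) is labeled baseline or burst by a Viterbi pass that trades goodness of fit against the cost of switching into the burst state.

```python
import math


def detect_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Label each batch 0 (baseline) or 1 (burst) with a two-state Viterbi pass.

    relevant[t] = titles in batch t containing the word; totals[t] = all titles.
    s scales the burst-state rate; gamma scales the cost of entering a burst.
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)   # baseline rate over the whole stream
    if not 0.0 < p0 < 1.0:
        raise ValueError("word must appear in some, but not all, titles")
    p1 = min(s * p0, 0.9999)           # elevated rate, kept below 1
    rates = (p0, p1)
    enter_cost = gamma * math.log(n)   # assumed penalty for moving 0 -> 1

    def emit_cost(state, r, d):
        # Negative log binomial likelihood; the C(d, r) term is identical
        # for both states, so it is dropped.
        p = rates[state]
        return -(r * math.log(p) + (d - r) * math.log(1.0 - p))

    cost = [0.0, math.inf]             # the sequence starts in the baseline
    back = []                          # back[t][j] = best predecessor of j
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for j in (0, 1):
            options = [cost[i] + (enter_cost if j > i else 0.0) for i in (0, 1)]
            best = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best] + emit_cost(j, r, d))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states
```

For a word that appears in about 1% of titles in most batches but in 10 to 12% for three consecutive batches, the pass marks those three batches as a burst.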
[Figure annotations: Announcement; Cited by other papers; Web Link Added]

Review papers have a distinctive pattern of use: an initial spike after
announcement, followed by a long nearly-constant tail.

Arxiv.org: Research And Development Directions

  • 1. ArXiv.org: 250,000 documents, 47,000 registered users, 1 million+ downloads per year. Cost per paper: $10,000 (commercial journal), $1,000 (non-profit journal), $10 (arXiv).
  • 2. Goal: Process increasing number of submissions at constant or declining cost
  • 3. arXiv has an active core of users: 10% of users are responsible for about 1/3 of all submissions, 50% of all users have logged in (to submit or update a paper) in the past 1.5 years
  • 4. Authentication and Access Control: Recently moved from an HTTP authentication/Berkeley database system to a system based on cookies and a relational database. Currently, all registered users who haven’t been suspended can submit to all subject classes in all archives; the original submitter, or anybody with the paper password, can update the paper. People are allowed to register depending on their e-mail address: abc@university.edu can register, but xyz@company.com can’t unless company=ibm,lucent,…; this list is hard to maintain (we have to block popular ISPs in every country), exceptions are dealt with manually at great cost (each case takes detective work), and there are many people in .edu (alumni, non-research staff) who shouldn’t be able to submit. Because registration and submission are linked, the user database can’t be used to offer other services such as e-mail notification and personalization.
  • 5. Endorsements and Trust Management (diagram labels: Administrators, Grandfathered Users). In the new system, everyone will be able to register. Users who registered under the old system will still be able to upload to any archive or subject class, but new users will need to be endorsed by an author with a publication history in that category. The burden shifts from one senior staff person to 47,000 registered users. User database can be used
  • 7. Web-based interface for administrators: • View user history and publications • Monitor endorsement process • Manage authority records • Disable ability to submit or endorse • Keep “institutional memory”
  • 8. Future Directions: • Flexible Submission Queue (currently submissions are published the following evening; we can’t easily delay a submission) • Validating Metadata Form (force users to clean up entry errors, so administrators don’t have to) • Automatic Protection (suspicious submissions and endorsements will be automatically delayed) • New Search Engine based on Lucene • Retrofit e-mail notification (current awareness) to use the new user database.
  • 9. Classifying Articles with the Support Vector Machine. Paul Ginsparg, Paul Houle, Thorsten Joachims, Jae-Hoon Sul. Goal: identify papers in existing archives that are relevant to a new subject archive, q-bio (Quantitative Biology).
  • 10. Active Training of SVM (figure legend: Training: q-bio; Training: not q-bio; Other far from margin; Other close to margin). The SVM finds the maximum-margin hyperplane. We do a first training run on one year of data, then identify other papers that lie close to the dividing line. We iteratively classify these by hand to refine the classification.
  • 11.
  • 12. Classifier performance improves as the size of a category increases.
  • 13.
  • 14. Time Series Analysis of Content and Usage Information. Paul Ginsparg, Jon Kleinberg.
  • 15. Kleinberg’s algorithm uses a hidden Markov model to detect bursts of word usage in arXiv titles, and reveals intellectual trends in the last decade of high-energy physics theory.
  • 16. (Figure labels: Announcement; Cited by other papers; Web Link Added.) Review papers have a distinctive pattern of use: an initial spike after announcement, followed by a long, nearly constant tail.
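
The endorsement rule on slide 5, together with the administrators' ability to disable submitting or endorsing (slide 7), can be sketched as a small access check. This is only an illustration, not arXiv's implementation: the `User` fields, the function names, and the reading of "publication history" as "at least one paper in the category" are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class User:
    # All field names are hypothetical; they are not taken from arXiv's schema.
    name: str
    grandfathered: bool = False  # registered under the old system
    suspended: bool = False      # disabled by an administrator
    published_in: Set[str] = field(default_factory=set)  # categories with papers
    endorsed_for: Set[str] = field(default_factory=set)  # categories endorsed for


def can_endorse(endorser: User, category: str) -> bool:
    # Assumption: one paper in the category counts as a publication history.
    return not endorser.suspended and category in endorser.published_in


def endorse(endorser: User, endorsee: User, category: str) -> bool:
    """Record an endorsement if the endorser qualifies; return whether it did."""
    if not can_endorse(endorser, category):
        return False
    endorsee.endorsed_for.add(category)
    return True


def can_submit(user: User, category: str) -> bool:
    # Grandfathered users may submit anywhere; new users need an endorsement.
    if user.suspended:
        return False
    return user.grandfathered or category in user.endorsed_for
```

Registration itself needs no check in this sketch, which matches the slide's point that decoupling registration from submission frees the user database for other services.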
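
The selection step of the active training loop on slide 10 (pick the unlabeled papers that lie closest to the dividing line, then label them by hand) can be sketched as follows. The sketch assumes a linear SVM has already been trained and has produced a weight vector `w` and bias `b`; the feature vectors, paper ids, and function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def signed_distance(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed distance of feature vector x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def closest_to_margin(
    w: Sequence[float],
    b: float,
    unlabeled: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the ids of the k unlabeled papers nearest the dividing line."""
    ranked = sorted(
        unlabeled.items(),
        key=lambda item: abs(signed_distance(w, b, item[1])),
    )
    return [paper_id for paper_id, _ in ranked[:k]]
```

In a full loop, the returned papers would be classified by hand, added to the training set, and the SVM retrained before the next round of selection.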
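
The burst detection mentioned on slide 15 can be illustrated with a simplified two-state version of the idea: a baseline state and a burst state, with the cheapest state sequence found by dynamic programming (Viterbi). This is a sketch under stated assumptions, not necessarily the configuration used for the arXiv titles. The batching of titles into time periods, the parameter defaults `s` and `gamma`, and the function name are assumptions.

```python
import math
from typing import List, Sequence


def burst_states(
    relevant: Sequence[int],
    totals: Sequence[int],
    s: float = 2.0,
    gamma: float = 1.0,
) -> List[int]:
    """Label each time batch 0 (baseline) or 1 (burst).

    relevant[t] is the number of titles in batch t containing the word,
    totals[t] the number of titles in batch t.
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)  # baseline rate of the word
    if not 0.0 < p0 < 1.0:
        raise ValueError("word must occur in some, but not all, titles")
    p1 = min(s * p0, 0.9999)          # elevated rate in the burst state
    rates = (p0, p1)
    up_cost = gamma * math.log(n)     # price of entering the burst state

    def fit_cost(p: float, r: int, d: int) -> float:
        # Negative binomial log-likelihood; the C(d, r) term is identical
        # for both states, so it is dropped.
        return -(r * math.log(p) + (d - r) * math.log(1.0 - p))

    cost = [0.0, math.inf]            # the sequence starts in the baseline state
    back = []                         # back[t][j]: best previous state for state j
    for r, d in zip(relevant, totals):
        new_cost, choices = [], []
        for j in (0, 1):
            cands = [cost[i] + (up_cost if j > i else 0.0) for i in (0, 1)]
            best_i = 0 if cands[0] <= cands[1] else 1
            new_cost.append(cands[best_i] + fit_cost(rates[j], r, d))
            choices.append(best_i)
        cost = new_cost
        back.append(choices)

    # Trace the cheapest path backwards from the final batch.
    state = 0 if cost[0] <= cost[1] else 1
    path = []
    for choices in reversed(back):
        path.append(state)
        state = choices[state]
    path.reverse()
    return path
```

For example, `burst_states([1, 1, 1, 10, 12, 11, 1, 1], [100] * 8)` marks the three middle batches as a burst. Raising `gamma` makes the model more reluctant to enter the burst state, and raising `s` requires a sharper jump in usage before a batch counts as bursty.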
