ANTCONC
Design and Development of a Freeware Corpus Analysis Toolkit for the Technical Writing Classroom
ABSTRACT
• AntConc is a freeware, multi-platform, and multi-purpose corpus analysis
toolkit, designed by the author for specific use in the classroom.
• It includes a powerful concordancer, word and keyword frequency
generators, tools for cluster and lexical bundle analysis, and a word distribution
plot.
• It also offers the choice of simple wildcard searches or powerful regular
expression searches, and has an extremely easy-to-use, intuitive interface.
BACKGROUND
• AntConc was first released in 2002. At the time, it was a simple KWIC (Key Word
In Context) concordancer program designed for use by over 700 students in a
scientific and technical writing course at the Osaka University Graduate School of
Engineering.
• It was developed in a Windows environment using the Perl 5.8 programming
language, and the graphical user interface (GUI) was developed using the
Perl/Tk 8.0 toolkit. This enabled the program to be easily ported to a
Linux/Unix environment, which was necessary as the course was initially taught
in a Linux-based CALL (Computer Assisted Language Learning) laboratory
before being moved to a Windows-based CALL laboratory the following year.
CONCORDANCER TOOL
• The central tool used in most corpus analysis software, including
AntConc, is the concordancer.
• As Sun & Wang describe, concordancers have been shown to be an effective
aid in the acquisition of a second or foreign language, facilitating the
learning of vocabulary, collocations, grammar and writing styles.
• Research has shown that new vocabulary can only be acquired through
meeting words in diverse natural contexts and in varied situations.
• A concordance program can find and display a huge number of examples
in varied contexts and situations quickly and efficiently using a reasonably
large corpus.
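As a rough illustration of what a KWIC concordancer produces, the sketch below finds every occurrence of a search term and prints it with a fixed amount of context on either side, so that the hits line up in one column. It is not AntConc's own code (which is written in Perl); the function name `kwic`, the `width` parameter and the sample text are assumptions made for this example.

```python
import re

def kwic(text, term, width=30):
    """Return KWIC lines: each hit shown with `width` characters of context."""
    lines = []
    # \b restricts the match to whole words; re.escape treats the term literally.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the search term lines up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Hypothetical miniature corpus, for illustration only.
sample = (
    "In this paper we describe a corpus tool. "
    "The tool reads a corpus of raw text files. "
    "Each corpus file is searched directly."
)
for line in kwic(sample, "corpus", width=20):
    print(line)
```

Sorting the returned lines by the words to the left or right of the hit is what turns a list like this into the sorted KWIC display described in the following slides.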
TOOLS AND FEATURES IN ANTCONC
• Multiplatform
– Windows 95 or later
– Unix / Linux
• Extensive set of text analysis tools
– KWIC Concordance
– Search Term Distribution Plot
– Original File View
– Word Clusters / Lexical Bundles
– Word lists
– Keyword list
• Powerful Search Features
– Regular Expressions (REGEX)
– Extensive Wildcards
• Multiple-Level Sorting
• Freeware License
• Small memory requirement (~2 MB of disk space)
• Easy-to-use, intuitive GUI
• Unicode Support
• HTML/XML Tag Handling
• The Concordancer Tool of AntConc has a wide range of features that make it an
effective tool not only for learners, but also for teachers and researchers.
• The features are:
1. Search terms can be substrings, words, or phrases, and can be either case
sensitive or insensitive. They can also include a wide range of wildcards, each of
which the user can assign to any particular character or string of characters.
2. Search terms can be defined as full regular expressions (REGEX), offering the
user access to extremely powerful and complex searches.
3. Three levels of sorting of KWIC (Key Word In Context) lines are possible, with
user-definable highlight colours at each level.
4. If a user clicks on any search term in the KWIC results display, the program
automatically opens the View Files Tool (described later) and shows the search
term hit embedded in the original file.
5. The KWIC results display is divided into columns, in which the hit number, KWIC
line, and file name are shown separately. Each column can be either displayed or
hidden, and standard selection methods can be used to save data in the columns or
rows to the clipboard or a text file.
Concordance Search Term Plot Tool
• The main purpose of the Concordancer Tool is to show how a search term is used in
a target corpus.
• The Concordance Search Term Plot Tool offers the same search term options as the
Concordancer Tool, but the results are displayed in a quite different way.
• It is an effective aid, for example, in determining where phrases such as ‘we’ or ‘in this
paper’ are used in research articles, or in determining which research articles use a
particular keyword or phrase.
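One way to picture the output of the Concordance Search Term Plot Tool is as a strip per file in which each hit is marked at its relative position. The sketch below is only an approximation under that assumption: the names `plot_positions` and `render_bar`, the 40-character strip width and the two sample "articles" are invented for illustration and are not taken from AntConc.

```python
import re

def plot_positions(files, term):
    """Map each file name to the relative positions (0.0 to 1.0) of its hits."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    result = {}
    for name, text in files.items():
        length = max(len(text), 1)  # avoid dividing by zero on empty files
        result[name] = [m.start() / length for m in pattern.finditer(text)]
    return result

def render_bar(positions, width=40):
    """Draw one text strip: '|' marks a hit, '.' marks no hit."""
    cells = ["."] * width
    for pos in positions:
        cells[min(int(pos * width), width - 1)] = "|"
    return "".join(cells)

# Hypothetical two-file corpus; "x " stands in for ordinary running text.
articles = {
    "article1.txt": "In this paper we propose a method. " + "x " * 50 + "Finally, we conclude.",
    "article2.txt": "x " * 60,
}
for name, positions in plot_positions(articles, "we").items():
    print(f"{name:<14}{render_bar(positions)}  hits={len(positions)}")
```

A strip with marks only near its two ends suggests that the term is used in the introduction and conclusion, and an empty strip shows at a glance that a file does not use the term at all.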
VIEW FILES TOOL
• When a user clicks on a search term in the results display of the
Concordancer Tool, the View Files Tool is used to display the search term in
the original file.
• The View Files Tool can also be used independently to search for any
substring, word, phrase or regular expression in a target file, offering the
user a very powerful text search engine.
• All resulting hits are displayed in a user-definable highlight colour, and
buttons and keyboard shortcuts can be used to jump to a specified hit
anywhere in the file.
• If the user clicks on one of the highlighted search terms, all KWIC lines
based on that term are automatically shown using the Concordancer Tool.
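The search behaviour of the View Files Tool can be sketched as a function that turns a wildcard term into a regular expression, collects the character offsets of every hit, and marks the hits in the file text. The wildcard meanings used here (`*` for zero or more word characters, `?` for exactly one) are assumptions for this example; in AntConc the wildcards are user-assignable. The `>> <<` markers merely stand in for a highlight colour.

```python
import re

def wildcard_to_regex(term):
    """Translate a simple wildcard term into a regular expression string."""
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")   # assumed meaning: zero or more word characters
        elif ch == "?":
            parts.append(r"\w")    # assumed meaning: exactly one word character
        else:
            parts.append(re.escape(ch))
    return r"\b" + "".join(parts) + r"\b"

def find_hits(text, term, case_sensitive=False):
    """Return (start, end) character offsets of every hit in the file text."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return [m.span() for m in re.finditer(wildcard_to_regex(term), text, flags)]

def highlight(text, hits):
    """Wrap each hit in >> << markers, standing in for a highlight colour."""
    out, last = [], 0
    for start, end in hits:
        out.append(text[last:start])
        out.append(">>" + text[start:end] + "<<")
        last = end
    out.append(text[last:])
    return "".join(out)

page = "We analyse corpora. The analysis analyses each corpus."
hits = find_hits(page, "analys*")
print(highlight(page, hits))
```

Because the hits are stored as offsets, jumping to a specified hit amounts to scrolling the display to `hits[n][0]`.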
Word List / Keyword List Tools
• Word lists are useful as they suggest interesting areas for investigation and
highlight problem areas in a corpus.
• Bowker & Pearson describe how word lists can also be used to find families
of related word forms and lemmas in a corpus.
• Hockey states that an ideal word list generation program should be able to
sort words into alphabetical or frequency order.
• Users can specify the reverse of a stop list, i.e. a list of only the words
that should be counted. These words can be specified either by direct input
from the keyboard or from a separate file.
• The Keywords Tool operates in an almost identical way to the Key Words
Tool in WordSmith Tools, calculating the ‘keyness’ of words and offering
the user the option of displaying or hiding unusually infrequent keywords.
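The word list and keyword ideas above can be sketched in a few lines. The slide does not name the statistic behind ‘keyness’, so the example assumes the log-likelihood measure, which is one common choice in corpus linguistics; the function names, the `include_only` parameter (the reverse of a stop list) and both miniature corpora are hypothetical.

```python
import math
import re
from collections import Counter

def word_list(text, include_only=None):
    """Count word frequencies; `include_only` is the reverse of a stop list."""
    words = re.findall(r"[a-z]+", text.lower())
    if include_only is not None:
        words = [w for w in words if w in include_only]
    return Counter(words)

def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood keyness of a word seen `a` times in a target corpus of
    `total_a` tokens and `b` times in a reference corpus of `total_b` tokens."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    ll = 0.0
    for observed, expected in ((a, expected_a), (b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

target = word_list("the corpus tool searches the corpus and sorts the corpus hits")
reference = word_list("the cat sat on the mat and the dog sat on the log")

# Frequency order, with ties broken alphabetically.
for word, freq in sorted(target.items(), key=lambda kv: (-kv[1], kv[0])):
    score = log_likelihood(freq, reference[word],
                           sum(target.values()), sum(reference.values()))
    print(f"{word:<10}{freq:>3}{score:>8.2f}")
```

A word with a high score whose relative frequency is lower in the target corpus than in the reference corpus would be one of the unusually infrequent keywords that the user can choose to display or hide.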
WORD CLUSTERS/BUNDLES TOOL
• In AntConc, multi-word units can be investigated using the Word Clusters
Tool, which displays clusters of words centred on a search term and orders
them alphabetically or by frequency.
• The search term can be specified as a substring, word, phrase or regular
expression, as in the Concordancer, Plot and View Files Tools, and the
number of additional words to the left and right of the search term can also
be specified.
• AntConc includes lexical bundle searches as an option in the Word Clusters
Tool. Calculating all the lexical bundles for a particular set of criteria
can take a great deal of time, so, as in all other tools in the program, the
processing can be halted at any time by clicking on the ‘Stop’ button.
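The difference between a cluster search and a lexical bundle search can be sketched as follows. The example treats a cluster as the search term plus a chosen number of words to its left and right, and a lexical bundle as any recurring sequence of n words, regardless of search term. These definitions, the function names, the `min_freq` threshold and the sample text are assumptions for illustration; sentence boundaries are ignored for brevity.

```python
import re
from collections import Counter

def clusters(text, term, left=0, right=2):
    """Count word clusters made of `term` plus `left`/`right` neighbouring words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        # Skip hits too close to either end to fill the requested window.
        if word == term and i - left >= 0 and i + right < len(words):
            counts[" ".join(words[i - left:i + right + 1])] += 1
    return counts

def lexical_bundles(text, n=3, min_freq=2):
    """Count every n-word sequence, keeping those seen at least `min_freq` times."""
    words = re.findall(r"[a-z']+", text.lower())
    grams = Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return {gram: freq for gram, freq in grams.items() if freq >= min_freq}

sample = (
    "In this paper we present a tool. In this paper we also test it. "
    "As a result of this, the results of this study are clear."
)
by_frequency = sorted(clusters(sample, "this", left=1, right=1).items(),
                      key=lambda kv: (-kv[1], kv[0]))
print(by_frequency)
print(lexical_bundles(sample, n=3))
```

The bundle search has to count every n-word sequence in the corpus rather than only those around a search term, which is consistent with the note above that it can take a great deal of time.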
LIMITATIONS OF ANTCONC
• Concordancers can be divided into two main types:
1) those that first build an index which is used for subsequent search
operations
2) those that act directly on the raw text
• The first type has the advantage that it can operate on large corpora, but it
tends to be less flexible than the second type.
• AntConc fits into the second category, performing all processing on the raw data
files, and storing results in active memory.
• Most corpus analysis programs offer users the ability to see the collocates of a
search term in a table, where the frequency of the most common words to the left or
right of the search term is indicated. AntConc does not yet include such a tool.
• One of the weakest areas of AntConc is in the handling of annotated data such as
data encoded in HTML/XML format.
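The trade-off between the two types of concordancer described at the start of this slide can be made concrete with a small sketch. The first function builds a word index once, so later whole-word lookups do not rescan the files; the second scans the raw text on every search, which is slower on large corpora but accepts any regular expression. All names and the two-file corpus are hypothetical, and neither function is taken from AntConc.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Yield (word, start_offset) pairs for a raw text."""
    for match in re.finditer(r"[a-z]+", text.lower()):
        yield match.group(), match.start()

def build_index(files):
    """Type 1: build word -> [(file, offset), ...] once, before any search."""
    index = defaultdict(list)
    for name, text in files.items():
        for word, offset in tokenize(text):
            index[word].append((name, offset))
    return index

def search_raw(files, pattern):
    """Type 2: scan the raw text on every search; any regular expression works."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(name, m.start()) for name, text in files.items()
            for m in regex.finditer(text)]

corpus = {"a.txt": "The corpus is small.", "b.txt": "Large corpora need an index."}
index = build_index(corpus)
print(index["corpus"])                      # whole-word lookup without rescanning
print(search_raw(corpus, r"corp(us|ora)"))  # pattern search over the raw text
```

The index answers only the questions it was built for (here, whole-word lookups), which is one way to read the remark that indexed concordancers tend to be less flexible.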
FUTURE DEVELOPMENTS
• The first improvement will be a redesign of the View Files Tool to make it
operate with far greater speed. The current tool is able to handle files
with ambiguous line endings, but this comes with a heavy loss in speed.
• The next release will also include a tool to view collocates, and the ability
to sort word lists alphabetically from both the beginning and end of
words, a feature recommended by Hockey.
• AntConc will be improved to handle annotated data, in particular XML, in
a much more powerful and intuitive way. Such data also includes header
definitions that, if extracted, can be used as part of the search criteria.
• A detailed user manual and accompanying tutorial video are planned for the
software, where the operation of each tool will be explained with concrete
examples and a step-by-step guide.
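Sorting a word list alphabetically from the end of words, one of the planned features listed above, can be illustrated by sorting on each word's reversed spelling. This is only a sketch of the idea with an invented function name and word list, not a description of how AntConc will implement it.

```python
def sort_from_word_end(words):
    """Sort words alphabetically by their reversed spelling."""
    return sorted(words, key=lambda w: w[::-1])

words = ["analysis", "tested", "testing", "sorted", "sorting", "thesis"]
print(sort_from_word_end(words))
```

Sorting this way places words that share an ending next to each other, so forms ending in "-ed", "-ing" or "-sis" appear as groups.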
