Text mining

project: William Godwin's Diary

The project provides a digital edition of the diary of William Godwin (1756-1836). Godwin’s diary consists of 32 octavo notebooks. The first entry is for 6 April 1788 and the final entry is for 26 March 1836, shortly before he died. The diary is a resource of immense importance to researchers of history, politics, literature, and women’s studies.

tool: FocusOPEN Digital Asset Manager

Open source Digital Asset Management solution designed for medium size preservation, cataloguing, media archiving and batch transcoding.
Methods relating to this tool | Category
Animation | Practice-led research
Cataloguing and indexing | Data structuring and enhancement
Collaborative publishing | Data publishing and dissemination
Collating | Data analysis
Content analysis | Data analysis
Curation | Strategy and project management
Data mining | Data analysis
General project management | Strategy and project management
Graphical interaction (synchronous) | Communication and collaboration
Graphical rendering | Data structuring and enhancement
Image feature measurement | Data analysis
Image manipulation | Practice-led research
Image segmentation | Data analysis
Indexing | Data analysis
Manual input and transcription | Data capture
Overlaying | Data analysis
Photography | Practice-led research
Preservation | Strategy and project management
Record linkages | Data structuring and enhancement
Resource sharing | Communication and collaboration
Server scripting | Data publishing and dissemination
Statistical analysis | Data analysis
Streaming media | Data publishing and dissemination
Text encoding - presentational | Data structuring and enhancement
Text encoding - referential | Data structuring and enhancement
Text mining | Data analysis
Text recognition | Data capture
Textual interaction (asynchronous) | Communication and collaboration
Textual interaction (synchronous) | Communication and collaboration
Use of existing digital data | Data capture
User contributed content | Data publishing and dissemination
Video and moving image compression | Data structuring and enhancement
Video editing | Practice-led research
Video post production | Practice-led research
Video-based interaction (asynchronous) | Communication and collaboration

project: Geographies of Orthodoxy: mapping the English-Pseudo-Bonaventuran Lives of Christ, c. 1350-1550

Geographies of Orthodoxy offers a new account of an English devotional phenomenon and affective literary tradition usually characterised as ‘pseudo-Bonaventuran’ by modern commentators. Geographies of Orthodoxy proposes to examine and make openly accessible through the latest electronic means the entire material remains of the anglophone pseudo-Bonaventuran tradition.

tool: Solr

Solr is an open source enterprise search platform from the Apache Lucene project. It operates as a standalone full-text search server within an appropriate servlet container, such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language.
Methods relating to this tool | Category
Cataloguing and indexing | Data structuring and enhancement
Collating | Data analysis
Collocating | Data analysis
Content analysis | Data analysis
Data mining | Data analysis
Indexing | Data analysis
Searching and querying | Data analysis
Text mining | Data analysis
Topic Detection and Tracking | Data analysis
Alternate tool(s):

Sphinx
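
The REST-like HTTP interface mentioned in the Solr description can be illustrated with a short sketch that builds a query URL for Solr's standard select handler. The host, port, core name ("diary") and field name ("text") are assumptions made for illustration only; they do not come from any of the projects listed here. The sketch only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, core, query, rows=10, fmt="json"):
    """Build a URL for Solr's /select handler.

    base_url and core are hypothetical values supplied by the caller;
    q, rows and wt are standard Solr query parameters.
    """
    params = {"q": query, "rows": rows, "wt": fmt}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance with a core named "diary".
url = build_solr_select_url(
    "http://localhost:8983/solr", "diary", "text:Godwin", rows=5
)
print(url)
```

Because the request is an ordinary HTTP GET, the resulting URL could be fetched from virtually any programming language, which is the point the description makes about the API.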

tool: Lucene

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications.
Methods relating to this tool | Category
Cataloguing and indexing | Data structuring and enhancement
Collating | Data analysis
Collocating | Data analysis
Content analysis | Data analysis
Data mining | Data analysis
Indexing | Data analysis
Parsing | Data analysis
Text mining | Data analysis
Topic Detection and Tracking | Data analysis
Alternate tool(s):

InQuira, Verity, dtSearch, ISYS

tool: Exceed

Exceed is a PC X server system that allows graphical user interface (GUI) interaction with networked computers. Exceed provides data exchange among applications on different platforms, including UNIX, Linux, VMS, X Window-based systems and IBM mainframes.
Methods relating to this tool | Category
Audio-visual interaction (synchronous) | Communication and collaboration
Cataloguing and indexing | Data structuring and enhancement
Coding and standardisation | Data structuring and enhancement
Collaborative publishing | Data publishing and dissemination
Content analysis | Data analysis
Data mining | Data analysis
Data modelling | Data structuring and enhancement
General project management | Strategy and project management
Geo-referencing and projection | Data structuring and enhancement
Graphical interaction (synchronous) | Communication and collaboration
Graphical rendering | Data structuring and enhancement
Preservation | Strategy and project management
Record linkages | Data analysis
Record linkages | Data structuring and enhancement
Resource sharing | Communication and collaboration
Resource sharing | Data publishing and dissemination
Searching and querying | Data analysis
Text mining | Data analysis
Textual interaction (synchronous) | Communication and collaboration
Use of existing digital data | Data capture
Alternate tool(s):

MicroXwin, X Window System

tool: GeoParser

GeoParser is a text analysis tool that may be used to identify, tag and (where appropriate) disambiguate references to geographic locations in a text resource. The tool uses Natural Language Processing to analyse the composition of a resource and identify words that match entries in its geographic database. The approach is useful for processing ambiguous references, such as names that may refer to one of several locations (e.g. Belfast in Ireland, New Zealand and Canada), and for distinguishing place names from ordinary words with the same spelling (e.g. Reading in Berkshire and reading as an activity). GeoParser may be used with GeoCrossWalk to tag a place name with full geographical coordinates (e.g. an OS National Grid Reference).
Methods relating to this tool | Category
Cataloguing and indexing | Data structuring and enhancement
Content analysis | Data analysis
Data mining | Data analysis
Documentation | Strategy and project management
Geo-referencing and projection | Data structuring and enhancement
Parsing | Data analysis
Searching and querying | Data analysis
Spatial data analysis | Data analysis
Text encoding - referential | Data structuring and enhancement
Text mining | Data analysis
Text recognition | Data capture
Textual interaction (asynchronous) | Communication and collaboration
Texture design and mapping | Practice-led research
Alternate tool(s):

Metacarta’s GeoTagger, Digital Reasoning’s GeoLocator, Lockheed Martin’s AeroText, and SRA’s NetOwl
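
The matching step described for GeoParser, in which words are compared against a geographic database, can be sketched as a toy gazetteer lookup. This is not GeoParser's actual algorithm or API: the gazetteer contents, the function name tag_places and the capitalisation heuristic are all assumptions chosen to mirror the Belfast and Reading examples in the description. A real system would use linguistic context; this heuristic would, for instance, mistag "Reading" as a place when it opens a sentence as a verb.

```python
import re

# Hypothetical toy gazetteer: place name -> candidate locations.
GAZETTEER = {
    "Belfast": ["Ireland", "New Zealand", "Canada"],
    "Reading": ["Berkshire"],
}


def tag_places(text):
    """Return (name, offset, candidates) for each word found in the gazetteer.

    Matching is case-sensitive, so lowercase "reading" (the activity)
    is not tagged, while capitalised "Reading" (the town) is.
    """
    hits = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group()
        if word in GAZETTEER:
            hits.append((word, match.start(), GAZETTEER[word]))
    return hits


sample = "She was reading a letter from Belfast on the train to Reading."
for name, offset, candidates in tag_places(sample):
    # More than one candidate means the reference still needs disambiguation.
    status = "ambiguous" if len(candidates) > 1 else "resolved"
    print(name, offset, status, candidates)
```

In this sketch "Belfast" is reported as ambiguous because the gazetteer holds three candidates; choosing between them is the disambiguation step that the description attributes to the tool.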

project: Embedding GeoCrossWalk

The Embedding GeoCrossWalk project sought to provide a deeper understanding of how references to place in structured texts can be researched and automatically extracted. The project aims were threefold. Firstly, it sought to deploy the Geoparser tool, developed previously by the Language Technology Group of Edinburgh University's School of Informatics, to georeference the Stormont Papers, using Natural Language Processing (NLP).

project: Montréal l'avenir du passé (MAP)

Montréal l'avenir du passé (MAP) was established in 2000 to create an historical GIS research infrastructure for 19th and 20th century Montréal. We have digitized six highly detailed historical maps representing all buildings in the city for 1825, 1846, 1880, 1912, 1949 and 2000. The first three and last have been geo-referenced and we have successfully "peopled" them by linking at the street-scape (1846) or lot level (1880 & 2000) census returns, tax records, city directories and a wide variety of non-routinely generated sources.

project: HESTIA

HESTIA provides a new approach towards conceptions of space in the ancient world, supported by a grant from the Arts and Humanities Research Council (AHRC). Combining a variety of different methods, it examines the ways in which space is represented in Herodotus' History, in terms of places mentioned and geographic features described.