Wednesday, December 16, 2009

Australian Newspapers Beta update

The one millionth page is now available on Australian Newspapers beta. Congratulations to the National Library of Australia. They recently reported:

"The millionth page was made publicly available on 14 December 2009, marking a project milestone. The millionth page contained the 10 millionth article. This was a 1901 edition of the Sydney Morning Herald. There will be 40 million articles available by 2011.

Digitisation started in 2007, with 4.4 million pages targeted for digitisation over 4 years, to be complete and publicly accessible as full text articles by June 2010. So far, 3 million of the identified 4.4 million pages have been scanned from microfilm into digital images. Of the 3 million scanned pages, 1 million have been converted into full text articles by the OCR process and are publicly available. The remaining pages will be made available between now and June 2010. [Visual progress chart]

The 1 million pages publicly available amount to 10 million articles with coverage dates of 1803-1954.

Public users have enhanced the data significantly since August 2008 by correcting 8.13 million lines of text in 368,390 articles, which greatly improves searching. In addition, 5,061 comments and 230,384 tags have been added to articles; these will be used for search and retrieval in the 2010 version of Trove.

2. Sydney Morning Herald

The first 70 years of the Sydney Morning Herald (1831-1901) are now publicly available.

Please be aware that some issues of the Sydney Morning Herald are missing. These are being sourced in hard copy from locations in Australia and will be added to the public service in 2010. So don't worry if you spot a missing issue: we know about it and it will appear in the service soon.

3. Public search interfaces and integration into TROVE
There are currently 2 search interfaces available:
Standalone service [BLUE INTERFACE] - Australian Newspapers v1.0
Newspapers integrated into Trove [GREEN INTERFACE] - still under development as of December 2009, expected to be completed by the end of January 2010.
About the integration of Newspapers into Trove:
Positive feedback was received from public users in 2009 about integrated searching of newspapers with other resources, when, as a trial, newspapers could be searched together with other content. The new Trove service integrates searching of many different resources at once (the Australian National Bibliographic Database, Australian Newspapers, Picture Australia, Australian Research Online, PANDORA, OAIster, Open Library, the Hathi Trust, the Internet Archive and the Library of Congress tables of contents, publishers' descriptions and sample book chapters).
In December 2009 work began to integrate the standalone newspapers service into the wider Trove service. From December 2009 to the end of January 2010, work will be undertaken in Trove so that the full functionality of Australian Newspapers v1.0 is duplicated. The public will be invited to give feedback before the standalone version is removed and replaced with the Trove version.
As at 14 December 2009:
Text correction, tagging and commenting histories and rankings are identical in both services and will remain so (though history and ranking tables look different in Trove). These functions can be done in either service.
Search results in Trove differ from those in the standalone version (because tags and comments are included in the relevancy ranking in Trove).
Content is identical in both services.
If you have issues, please clearly report which version you are using (Green or Blue).
Read more about the Australian Newspapers - Trove integration here: ANDP_AN and Trove integration v1
4. Enhancements to Australian Newspapers service
In early 2010, enhancements suggested by the public in 2008 (beta feedback for the newspapers service) will be implemented in Trove. These will include establishing a public forum, a review of tagging functionality, enhancements to public profiles, and RSS feeds for new content.

5. ANDP website updated

The ANDP website has been updated with project information, project reports, title selection lists etc."
