Facts
WebArchiv contains 15.5 terabytes of data. Harvesting began on 3/9/2001.
New partners
8.9.2010
The following websites were recently added to WebArchiv:
Physiological Research
Savci : internetová encyklopedie savců
Jeroným Klimeš
Školní učení
Liga otevřených mužů
Výživné
Jak na doménový trh a internetové podnikání
CEMA : Central European Music Agency
In total:
2063 contracts
News
Jan 8, 12:54 PM
Around the World in 2 bln pages
Institutions from over 60 countries joined Internet Archive's unique global crawl of 2 billion web pages run last year. WebArchiv contributed 707 seeds. See list of participating countries and institutions.
Nov 8, 01:37 PM
Harvest 2007
We have finished a comprehensive harvest of domestic web resources (the .cz domain). The 2007 collection contains 81,300,000 documents (3.6 TB).
May 2, 02:01 PM
Web Cultural Heritage
The latest issue of DPC/PADI What's new in digital preservation (no. 15) features, in its Web archiving section, the Web Cultural Heritage project led by us.
May 2, 01:52 PM
New thematic harvest:
Prague will bid for the 2016 Olympic games.
Mar 21, 12:41 PM
Web Cultural Heritage
The one-year project entitled Web Cultural Heritage was finished in September 2006 and the final claim assessment based on the final report and finances was approved by the Commission's financial services.
FINAL REPORT
CLT2005/A1/CZ-77 – Web Cultural Heritage
Summary:
The project "Web Cultural Heritage" began on 25th September 2005, ended on
24th September 2006, and brought many positive results. The project team
consisted of specialists from four European countries: the Czech Republic,
Estonia, Slovenia and Slovakia.
The project was successful in a variety of areas. Many internet resources
(literature, art, science, etc.) have lasting value and significance, and
therefore they represent a heritage that should be protected and preserved
for current and future generations. The aim of this project was achieved: to
create a selection guideline allowing cultural heritage institutions to select
the most important and valuable internet resources.
First, the partner institutions analysed existing methods and selection
policies in different parts of the world and created their own proposals. These
were subsequently discussed at the meetings, and based on the results the final
guideline, the General Recommendations, was prepared. The partners regarded
this guideline as a very useful document which can be applied in any country,
and the project as helpful for their web archiving development (see Annex 1).
Technical issues were the second area of the project, which resulted in
the General Recommendations concerning selection criteria. Harvesting rules
were specified for harvesting websites with different web harvesters.
Thirty representative websites from each of the four participating countries
were selected, and the resulting 120 URLs formed the seed list for later
harvesting with the three harvesting tools available: Heritrix, HTTrack and
WebBird (incorrectly called WebCrawler in the Grant Application, which is
part of the Agreement). Despite some problems with Slovenian URLs, it
was possible to compare the harvests on a website level and partly on a
MIME type level as well.
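As a rough illustration of the comparison described above, the following Python sketch counts captured documents per harvester on a website level and on a MIME type level. The record layout, the example URL and all counts are hypothetical and are not project data; only the tool names and the 30 x 4 seed arithmetic come from the report.

```python
from collections import Counter

# Seed list size as stated in the report: 30 websites from each of 4 countries.
COUNTRIES = ["Czech Republic", "Estonia", "Slovenia", "Slovakia"]
SEEDS_PER_COUNTRY = 30
total_seeds = SEEDS_PER_COUNTRY * len(COUNTRIES)

# Hypothetical harvest records: (tool, seed URL, MIME type) per captured document.
# These values are invented for illustration only.
records = [
    ("Heritrix", "http://example.cz/", "text/html"),
    ("Heritrix", "http://example.cz/", "image/jpeg"),
    ("HTTrack", "http://example.cz/", "text/html"),
    ("WebBird", "http://example.cz/", "text/html"),
    ("WebBird", "http://example.cz/", "application/pdf"),
]


def count_by(records, key_index):
    """Count captured documents per tool, grouped by the field at key_index."""
    counts = {}
    for record in records:
        tool = record[0]
        counts.setdefault(tool, Counter())[record[key_index]] += 1
    return counts


per_site = count_by(records, 1)      # website-level comparison
per_mimetype = count_by(records, 2)  # MIME-type-level comparison

print("seeds:", total_seeds)
for tool in sorted(per_mimetype):
    print(tool, dict(per_site[tool]), dict(per_mimetype[tool]))
```

Grouping by tool first keeps the two views symmetrical, so the same function serves both levels of comparison.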
All partners worked very hard to accomplish the objectives of the project.
Everybody worked responsibly and effectively from the very beginning. This
partnership was significant as a new phase of European co-operation and
raised awareness of the need for digital preservation, i.e. the need to take
steps to capture and preserve at-risk digital content that is vital to
European history.
The results of the project were presented at the 6th International Web
Archiving Workshop held in Alicante, Spain, 21–22 September 2006, as
part of the 10th European Conference on Research and Advanced Technology for
Digital Libraries: http://www.iwaw.net/06/PDF/iwaw06-zabicka.pdf
(see Annex 3).
The results of the selection policy analysis and of the IT activities are
described in detail in a publication prepared as an output of this project to
present its results. This publication is enclosed with this project
report (see Annex 4). In addition, all materials gathered by the team partners,
the drafts of recommendations for Selection Criteria, information on harvesting
and archive analysis, as well as the presentation of the project at the IWAW
2006 workshop, are available at the project website http://www.webarchiv.cz/culture-2000/
under the Documents section (see Annex 2).
Problems and solutions
During the project a few difficulties occurred. Some of them were easy to
cope with; others required more time to sort out.
Personnel situation
At the last moment it emerged that the Slovenian and Estonian teams each had
only one person, instead of the planned two, who could participate in most
activities (especially meetings). The reason was the participation of
these two countries in other European projects.
Financial situation
The financial difficulties were connected with the personnel situation. The
budget had planned amounts for two persons from each country. The amounts
planned for subsistence allowances and hotel expenses were also higher than
what needed to be spent.
The beneficiary planned the budget on the basis of the official schedule,
"The daily allowance rates accepted by the Commission for staff on
mission", published on the Culture2000 website, but in fact the rates in each
of the involved countries are far lower than the sums given in the
budget.
The beneficiary took advantage of the possibility to adjust the estimated
budget by transfers between items of eligible costs. More money was spent on
expenditures within chapters 4 and 6, and less money was used in chapters
2 and 5. The transfer did not exceed 20% of the amount of each item of
estimated eligible costs.
Information situation
The first two activities of the project and the main objective, the selection
guideline, were based on acquiring information from publicly available
resources. This proved insufficient, so three important international
conferences were chosen for visits. They were a great opportunity to
get essential information and to discuss some questions of the project with
international experts.
Organizational situation
Some organizational issues occurred which needed to be cleared up. That is
why the Estonian team came to Prague in November 2005 and the Slovak team
visited the INFORUM conference in Prague (May 2006), combined with
discussions with the Czech partners.
Technical situation
At the beginning of the project, the partner institutions were at very
different levels of technical and technological development. The Czech team was
the most experienced and assisted the partners (especially the Slovak team)
in identifying their national webspace and in setting the technical
parameters. For this reason the Slovak team came to Brno (Masaryk University) in
April to learn more about hardware and software tools.
Main objectives achieved in the project:
• link institutions from four countries and start an exchange of
experience (meetings, online discussions)
• enable contacts with experts from other countries (conference, online
discussions)
• provide a list of selection criteria (selection guideline) respecting
European requirements based on performed analysis
• provide adjusted software tools
• provide a best practice report to help cultural heritage
institutions (particularly national libraries) carry out their
tasks of preserving digital cultural heritage
List of Annexes:
ANNEX 1: Collaboration between partners (.pdf)
ANNEX 2: Project website (.pdf)
ANNEX 3: IWAW presentation (.ppt)
Publication presenting the project:
Can be ordered from the National Library of the Czech Republic (webarchiv@nkp.cz).

