Friday, November 14, 2008

Reading Notes: week 11

-Mischo: Digital Libraries: This article is about how some libraries went from offering a few digital resources to providing full digital services through the DLI, or Digital Library Initiative. Several programs, including ones at six major universities, received federal support when they began to digitize. Many of the technologies created by these digital projects are still in use and still evolving today, and some of the projects also considered federating, or sharing, their digital materials.
-Paepcke et al: Dewey Meets Turing: This article is also about the Digital Library Initiative, and it starts out by describing the initiative as a joining of librarians and computer scientists. Computer scientists wanted to make research easier and bring library resources closer to home. Librarians wanted to make sure they remained a part of scholarly work, and the initiative provided the funds and the means to achieve this. In the end, while many of the libraries' technologies changed, the core values stayed the same. This was an important part of the article to me; it showed how a field can grow and improve with technology while still holding on to what made it important in the first place.
-Lynch: Institutional Repositories: This article is about digital repositories of information and how they have changed the way scholars share information and communicate. Many of these repositories are housed at universities and were created to hold scholars' work in digital form. The article discusses the institutional pressures and politics behind whether or not to have a repository, and what these repositories might bring to the scholarly world in the future.

Muddiest Point- week 10

What is the difference between a controlled vocabulary and a thesaurus when you are making an index? Are they the same thing, and if not, how do they relate?

Wednesday, November 5, 2008

Reading Notes- week 10

-Search Engines, Part 1, David Hawking: This article was easy to understand, and it gave a basic overview of what web searching is and how it is done. It starts by explaining that data centers are clusters of computers; the machines have to be clustered together because there is too much information for any single computer to search through alone. These machines run crawlers, programs that crawl the web to gather pages. Crawlers can check for blocked pages, duplicated pages, or spam pages (pages that use one or more false keywords to gain more popularity).
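To make the crawling idea concrete for myself, here is a rough Python sketch (not Hawking's actual system): it fetches pages, skips ones blocked by robots.txt, and skips exact duplicates by hashing page content. The seed URL and page limit are made up.

```python
# A minimal crawler sketch: fetch pages, respect robots.txt, skip duplicates.
import hashlib
import urllib.request
import urllib.robotparser
from urllib.parse import urljoin
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect href values from <a> tags so the crawl can continue."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url, max_pages=10):
    # only checks the seed site's robots.txt; a real crawler checks every host
    robots = urllib.robotparser.RobotFileParser()
    robots.set_url(urljoin(seed_url, "/robots.txt"))
    robots.read()

    queue, seen_urls, seen_hashes = [seed_url], set(), set()
    while queue and len(seen_urls) < max_pages:
        url = queue.pop(0)
        if url in seen_urls or not robots.can_fetch("*", url):
            continue  # skip already-visited or blocked pages
        seen_urls.add(url)
        html = urllib.request.urlopen(url).read()
        digest = hashlib.sha1(html).hexdigest()
        if digest in seen_hashes:
            continue  # skip exact-duplicate pages
        seen_hashes.add(digest)
        parser = LinkParser()
        parser.feed(html.decode("utf-8", errors="ignore"))
        queue.extend(urljoin(url, link) for link in parser.links)

crawl("https://example.com/")  # hypothetical seed URL
```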
-Part 2: Part two picks up where part one leaves off. It explains that the collection of documents and words is so large that the work has to be split up so more than one piece can be searched at once. There are more searchable terms than there are words in the English language, because people search for words in many languages, as well as made-up words and acronyms. Phrases can be searched for, but they are often subdivided so that results come up faster. Web search tools can also rate pages by the number of links that lead to a certain page: pages with a lot of links pointing to them are considered more popular.
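Here is a toy version of that link-popularity idea (not the article's actual ranking method): count how many pages link to each page and rank by that count. The link data below is invented.

```python
# Rank pages by how many other pages link to them.
from collections import Counter

# hypothetical "who links to whom" data: page -> pages it links out to
outgoing_links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "d.html": ["c.html", "b.html"],
}

inbound_counts = Counter(
    target for targets in outgoing_links.values() for target in targets
)

# pages with the most links pointing at them rank highest
for page, count in inbound_counts.most_common():
    print(page, count)
# c.html 3
# b.html 2
# ...
```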
-Shreeves, S. L., Habing, T. O., Hagedorn, K., & Young, J. A., Current Developments: This article is about a protocol from the OAI, or Open Archives Initiative, for harvesting metadata. The initiative was started two years before this article was written, and the article is a response to how metadata collection has improved and progressed since then. It also comments on some future work the initiative would like to complete to advance the harvesting it is now doing.
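To see what the harvesting protocol actually looks like in practice, here is a bare-bones sketch: a harvester sends an HTTP GET with an OAI-PMH "verb" and gets XML records back. The repository URL below is a placeholder, not one from the article.

```python
# Minimal OAI-PMH harvesting request: ask a repository to list its records.
import urllib.request
import urllib.parse
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.edu/oai"  # hypothetical OAI provider

params = urllib.parse.urlencode({
    "verb": "ListRecords",       # standard OAI-PMH request type
    "metadataPrefix": "oai_dc",  # ask for simple Dublin Core metadata
})
with urllib.request.urlopen(f"{BASE_URL}?{params}") as response:
    tree = ET.parse(response)

ns = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# print the title of each harvested record
for record in tree.findall(".//oai:record", ns):
    title = record.find(".//dc:title", ns)
    if title is not None:
        print(title.text)
```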
-Bergman: This article by Bergman was about the deep web. I found it interesting to read because it explained that web search tools usually search only the surface of the web. The deep web holds about 500 times the amount of information that normal web searching usually brings up. It contains information that might be usable and important, but that is rarely seen. The article also made an effort to break down what types of information (such as news) are being lost in the deep web. I found it interesting because it introduced me to information I did not know before; I had not realized that most search engines do not search the deep web.

Muddiest Point- week 9

What parts of SGML were changed to create XML, and why did those changes make it so much better? XML is said to be easier to use, but in what ways?