Monthly Archive for January, 2006


The Open Directory Project

The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.

The idea behind it is that search engines are increasingly unable to capture the growing complexity of the web. The project's solution is to let people do this job: a community of editors (71,053 to date) reviews hundreds of sites each day and catalogues them.



Corporate Human Rights Violators of 2005

According to Global Exchange (full text here), an NGO that promotes social, economic and environmental justice, corporations carry out some of the most horrific human rights abuses of modern times. In this report they focus on the 14 worst companies:

Caterpillar -> contracting with known violators of human rights, enabling house demolition, supplying equipment that kills Palestinian civilians and American peace activists

Chevron -> environmental destruction, health violations, and violent killings

Coca-Cola -> violent killings, kidnap and torture, water privatization, health violations, and discriminatory practices

Dow Chemical -> creation of chemical weapons, marketing poisonous chemicals, illegal dumping of toxins into populated areas, environmental destruction, health problems, death

Dyncorp / CSC -> causing health problems, environmental devastation and death; endangering lives; physically abusing individuals; sex trafficking

Ford Motor Company -> environmental degradation, climate change, fueling wars for oil

Kellogg, Brown and Root -> Overcharging and providing unnecessary services on taxpayer’s dollar, bribery, exploiting third country nationals

Lockheed Martin -> War profiteering, warmongering

Monsanto -> Displacement, health violations, and child labor

Nestle USA -> Abusive child labor, repression of worker rights, aggressive marketing of harmful products, violation of national health and environmental laws

Philip Morris USA -> aggressively marketing lethal products

Pfizer -> Killer price-gouging

Suez-Lyonnaise des Eaux -> Water privatization

Wal-Mart -> worker rights violations, labor discrimination, union busting


I-Spy: a Collaborative Search engine

I-Spy is a community-based Internet meta-search engine that provides you with search results that are informed by similar users. By joining an I-Spy community and using it to search the Web you not only benefit from high-quality meta-search results, but also from the result selections of users within that community. The idea behind this is that users with similar interests are likely to find the same results interesting for similar queries.

I-Spy implements collaborative ranking, borrowing ideas from collaborative filtering. It is a meta-search engine in the sense that it builds on top of an existing web search engine (in this case Google).
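The collaborative ranking idea can be illustrated with a small Python sketch. It is not I-Spy's actual implementation: the class name `CommunityReranker`, its methods, and the scoring rule (share of a community's past selections for the exact same query) are assumptions made for illustration only.

```python
from collections import defaultdict


class CommunityReranker:
    """Hypothetical sketch: promote results that community members picked before."""

    def __init__(self):
        # hits[query][url] = how often members selected url after issuing query
        self.hits = defaultdict(lambda: defaultdict(int))

    def record_selection(self, query, url):
        """Remember that a community member clicked url for this query."""
        self.hits[query][url] += 1

    def rerank(self, query, results):
        """Reorder results from the underlying engine by community relevance."""
        selections = self.hits.get(query, {})
        total = sum(selections.values())

        def relevance(url):
            # Share of all past selections for this query that went to url.
            return selections.get(url, 0) / total if total else 0.0

        # sorted() is stable, so unselected results keep the engine's order.
        return sorted(results, key=relevance, reverse=True)


reranker = CommunityReranker()
reranker.record_selection("jaguar", "b")
reranker.record_selection("jaguar", "b")
reranker.record_selection("jaguar", "c")
print(reranker.rerank("jaguar", ["a", "b", "c"]))
```

Results that nobody in the community has selected fall back to the order returned by the underlying search engine, which is what makes this a layer on top of an existing engine rather than a replacement for it.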



FlashLite 2.0 is here

Flash Lite 2 is based on the Flash 7 standard for content. This means that content developed in the latest Flash authoring environment can be re-purposed for mobile and consumer electronic devices. It supports loading and parsing of external XML data in Flash content using the same XML handling methods as Flash Player 7.

Flash Lite 2 supports the ability to locally store and retrieve relevant, application-specific information such as preferences, high scores, usernames, etc. This provides a much more robust development environment.

Flash Lite 2 enables dynamic loading of multimedia content such as images, sound and video, based on supported codecs available on the device. This includes loading and handling XML data and SWF content. Flash Lite 2 also provides video support and external multimedia support. This includes in-place video as well as image loading (GIF, JPEG, PNG with transparency) and audio loading.

Flash Lite 2 enables developers to easily create sophisticated vector graphics and animated shapes, at runtime, using ActionScript 2.0.



MSN History Visualization

Messing around on the web 1.0 ;-) I found this nice visualization tool that I like. I guess it could even be used for some particular research purpose, maybe in the Mutual Modeling project. The application reads the XML files stored by MSN and builds a graphical display that lets you compare conversations with different people. It tries to answer the following questions:



  • how many words do I use in each utterance?
  • which are the words that I use the most?
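Both questions come down to simple counting. Here is a minimal Python sketch, assuming the utterances have already been extracted from the log files into a list of strings (the log format itself is not covered here, and the function names are made up for illustration):

```python
import re
from collections import Counter


def tokenize(text):
    """Split an utterance into lowercase words (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def words_per_utterance(utterances):
    """Average number of words per utterance; 0.0 for an empty conversation."""
    counts = [len(tokenize(u)) for u in utterances]
    return sum(counts) / len(counts) if counts else 0.0


def most_used_words(utterances, n=3):
    """The n most frequent words across all utterances, with their counts."""
    counter = Counter()
    for utterance in utterances:
        counter.update(tokenize(utterance))
    return counter.most_common(n)


conversation = ["hello there", "hello again my friend", "ok"]
print(words_per_utterance(conversation))
print(most_used_words(conversation, n=1))
```

Running the same two functions on conversations with different people gives the numbers one would need to compare them.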



Techniques for determining the location on UMTS networks

Lorys L. Pognon wrote a white paper on techniques for determining location on UMTS networks. The paper answers questions such as whether we can get location information on UMTS networks the same way we get it from Cell-ID on GSM networks.

LBS (Location-Based Services) is a recent concept that denotes applications integrating geographic location (spatial coordinates). One of the important aspects of LBS is the location of the mobile user. The location techniques differ depending on the network. This paper gives a brief overview of some existing technologies that could be used for mobile user localisation.

Download here the [pdf]. (via)


Iskodor: a Framework for Congenial Web Searching

ISKODOR is an experimental search system developed at the University of Bonn, whose goal is the implementation of ‘congenial web search’, a user-centered approach where search quality is constantly evaluated through explicit feedback.

ISKODOR implements personalized ranking matrices and collaborative information retrieval in the form of peer groups, which are used to limit the scope of a search.

The Web provides a global platform for knowledge sharing. However, several shortcomings still arise from the absence of personalization and collaboration in Web searches. More effective retrieval techniques could be provided by means of transforming explicit knowledge into implicit knowledge. Iskodor is based on a peer-to-peer architecture and aims at complementing classical Web searches in terms of personalized ranking lists. These local rankings can be accumulated and evaluated in order to supplement the process of knowledge generation by building Virtual Knowledge Communities. Furthermore, the aggregation of ranking lists can be used to identify topics as well as communities of interest. Together with social aspects for community support, a framework for congenial Web search is defined.



Social Information Retrieval

S. M. Kirsch. Social information retrieval. Diploma thesis in computer science, Rheinische Friedrich-Wilhelms-Universität Bonn, Institut für Informatik III, Bonn, Germany, 22nd of November 2005.

———————

The goal of this thesis is to combine well-established retrieval methodologies with recent social network analysis. The opening claim is that a modern information retrieval system should determine the exact nature of the user’s information needs. This can be achieved by looking at information that comes from immediate contacts, which is usually preferred to information from anonymous sources.

Current search engines, according to the author, are susceptible to a form of tyranny of the majority: they can only display those sites that will be relevant to the majority of their users, not necessarily to the actual user who submitted the query. Two viable solutions are identified in the literature and studied in depth: personalization of search and the addition of collaborative elements.

This thesis therefore defines the social information retrieval task and describes its domain. A formalization on the basis of associative networks is provided, as well as search procedures for these networks. An evaluation compares the described methods to conventional information retrieval methods.


Dissociating Semantic and Associative Word Relationships Using High-Dimensional Semantic Space

K. Lund, C. Burgess, and C. Audet. Dissociating semantic and associative word relationships using high-dimensional semantic space. In Proceedings of the Cognitive Science Society, pages 603–608, Hillsdale, N.J., USA, 1996. Erlbaum Press. [pdf]

————————

The paper studies the lexical/semantic priming effect and asks whether it is associative in nature. To shed some light on this question, two crucial points are tackled: first, an operational definition of ‘semantic’ and ‘associative’ is needed; second, a framework for modelling semantic representations must be defined.

Their proposition for the first point is that semantically related words (TABLE – BED) are instances of the same category and share a number of features. Associated words (MOLD – BREAD) are those which are associated as determined by human word association norms. There is also a third type of pair that is both semantically and associatively related (UNCLE – AUNT).

To address the second point they propose a framework called HAL (Hyperspace Analogue to Language) that makes it possible to simulate different experiments. The methodology is based on the computation of a matrix of co-occurrence vectors, one for each word, which can be analyzed for semantic content. Co-occurrence is defined using a window-size parameter (co-occurrence within n words). Then a similarity is computed between the vectors using a Euclidean distance measure.
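The window-based co-occurrence idea can be sketched in a few lines of Python. This is a simplified illustration, not the HAL model as published: it counts neighbours symmetrically and without any distance weighting, and the function names and toy corpus are assumptions.

```python
import math


def cooccurrence_vectors(tokens, window=2):
    """Build one co-occurrence vector per word.

    vectors[w][k] counts how often the k-th vocabulary word appears
    within `window` positions of w (on either side).
    """
    vocab = sorted(set(tokens))
    index = {word: k for k, word in enumerate(vocab)}
    vectors = {word: [0] * len(vocab) for word in vocab}
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][index[tokens[j]]] += 1
    return vectors


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


tokens = "the cat sat the dog sat".split()
vectors = cooccurrence_vectors(tokens, window=1)
# "cat" and "dog" occur in identical contexts in this toy corpus,
# so their vectors coincide and the distance between them is zero.
print(euclidean(vectors["cat"], vectors["dog"]))
print(euclidean(vectors["cat"], vectors["the"]))
```

A smaller distance means the two words appear in more similar contexts, which is the sense in which the vectors can be analyzed for semantic content.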

Using a certain dataset, the authors simulated a certain kind of association between word pairs. They then repeated the experiment with human subjects and compared the results. The conclusion was that the notion that associativity can be characterized by temporal association in language receives little or no support from their corpus analysis. Word association seemed to be more a function of semantic neighborhood.

Another interesting result was that the distinction between associative and semantic information corresponds to the distinction between local co-occurrence and global co-occurrence. Temporal information is reflected in local co-occurrence. The global pattern of co-occurrence across a vocabulary is connected to semantic information.


MySync: a Mac to Mac syncing utility without a .Mac

I found this nice application that lets you synchronise bookmarks, calendars, contacts, keychain, and mail between two or more Macs without buying an annoying .Mac account. Once MySync is installed on each computer to be synced, one copy of MySync needs to be configured as a “Master” node. The remaining copies should be configured as “Slave” nodes.

Once configured, the MySync nodes will automatically discover each other using Bonjour.

You will also need to enable syncing for the datatypes that you want to sync, and to run your first Sync manually; Apple’s Sync Framework will present a dialog on the first sync, asking you to confirm that MySync should be allowed to sync the specified data types. After your first sync, you can configure each MySync Slave node to sync automatically on a quarter hourly, hourly, daily, or weekly basis, or leave syncing to be activated manually.



Komodo: a commercial IDE for Python on OS X

Komodo seems to offer support for the native version of Python installed on OS X. It has lots of nice features, and the pricing for personal or student use seems pretty decent. However, there is no SVN support in the student version.


Design of Spatial Applications Workshop @ MIT Media Lab

This workshop will focus on learning about the tools available to build these new kinds of spatial applications, and reflect on thoughts from philosophy, urban planning, computer science, design, and geographic information systems. Students in the workshop will be expected to attend lectures and framing sessions where we will go over tools, technologies, and techniques for building spatial applications. There will be Friday critique sessions where students will have the opportunity to share their progress, ask questions, and receive feedback.

The workshop will culminate in a final design competition where the most provocative, interesting, and successful projects will receive recognition and where the “best of these” as determined by a panel of judges will be considered for the round and flat awards.

Additional Information:

spatialworkshop@media.mit.edu

An Introduction to Random Indexing

M. Sahlgren. An introduction to random indexing. In Proceedings of the Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, Copenhagen, Denmark, 16th August 2005. [pdf]

———————–

This paper presents the Random Indexing algorithm, introduced as a good alternative to LSI and similar word space methods based on the distributional hypothesis, which states that words with similar meanings tend to occur in similar contexts.

The author states the limits of word space models, namely efficiency and scalability: the co-occurrence matrix soon becomes computationally intractable as the vocabulary grows. Additionally, the author highlights the fact that a majority of the cells will be zero: only a tiny number of the words in a language are distributionally promiscuous; the vast majority of words only occur in a very limited set of contexts.

Available methods such as LSI address this by truncating the matrix to reduce its dimensionality. According to the author, this should be avoided for three reasons: 1) the reduction is computationally costly; 2) the reduction is a one-time operation that needs to be redone each time a new dimension is added; 3) the reduction requires initial sampling of the data that is often done with ad-hoc solutions.

Random Indexing reverses the traditional approach (first construct the co-occurrence matrix, then extract context vectors): it accumulates the context vectors first, and the co-occurrence matrix can be constructed from them afterwards.
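A minimal Python sketch of this accumulation step follows. It assumes the usual description of Random Indexing, where every word receives a sparse random index vector of fixed dimensionality containing a few +1 and -1 entries, and a word's context vector is the sum of the index vectors of its neighbours. The function names, the dimensionality, the number of non-zero entries and the window size are all illustrative choices, not values taken from the paper.

```python
import random


def make_index_vector(dim, nonzero, rng):
    """Sparse ternary vector: `nonzero` random positions, half +1 and half -1."""
    vector = [0] * dim
    for k, position in enumerate(rng.sample(range(dim), nonzero)):
        vector[position] = 1 if k % 2 == 0 else -1
    return vector


def random_indexing(tokens, dim=20, nonzero=4, window=2, seed=0):
    """Accumulate context vectors without ever building a full co-occurrence matrix."""
    rng = random.Random(seed)
    index_vectors = {}
    for word in tokens:
        if word not in index_vectors:
            index_vectors[word] = make_index_vector(dim, nonzero, rng)

    context_vectors = {word: [0] * dim for word in index_vectors}
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                # Add the neighbour's index vector to this word's context vector.
                neighbour = index_vectors[tokens[j]]
                target = context_vectors[word]
                for d in range(dim):
                    target[d] += neighbour[d]
    return index_vectors, context_vectors


tokens = "the cat sat the dog sat".split()
index_vectors, context_vectors = random_indexing(tokens, window=1)
# "cat" and "dog" have the same neighbours here, so their context vectors match.
print(context_vectors["cat"] == context_vectors["dog"])
```

The point of the design is that the dimensionality stays fixed at `dim` however large the vocabulary becomes, and new text can be folded in incrementally by adding more index vectors to the existing context vectors.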


Happy New Year 2006!: some considerations about blogging and summary of 2005

This is the first post of the new year, which should both summarize the year 2005 just ended and set the resolutions for the time to come. First of all, I should say that 2005 was a tremendous year for blogging. The thing just boomed; see these personal stats:

  • 2002 -> 7 posts
  • 2003 -> 116 posts
  • 2004 -> 333 posts (almost one per day)
  • 2005 -> 539 posts (almost two per day)

Using this tool I achieved two great objectives: first, keeping a log of the evolution of my research subject and conclusions; second, building a network of people working on the same subject (I have got 119 comments so far, of which a dozen were real and good research contacts). A third objective was communicating my updated research status to my co-workers who regularly check my blog, and finally, a good side effect was that it pushed me to be less lazy.

I can state without doubt that blogging and aggregation are now fundamental tools for research and learning in general. This is obviously a revolution within the internet revolution: a shift of control over content production. Internet 1.0 was a huge showcase. Internet 2.0 is a constantly evolving people-based environment, where users are in control of the content.

Some great scenarios are just one simple step ahead: the redefinition of the mainstream media's power over information into a citizen-driven domain. For the new year I would love to see fewer posts but with better and more original content. What I would love is to reblog less and talk in more detail about my own points of view. May the force be with us in 2006! Happy blogging!!
