This is just a short note to summarize some facts about the Italian economy that I, as a native Italian, find quite shocking. This post was prompted by two excellent articles on Italy's economic situation: the first is a report on Italy in The Economist, titled in Italian “Addio, Dolce Vita” (26th November, 2005); the second is an article in Time titled “Twilight in Italy” (5th of December, 2005). Between them, the two articles describe the Italian situation from a macro view down to a micro focus.
1st fact: Italy's weak point is the huge number of SMEs, which make up almost the entire economic system.
For ages the motto of domestic political campaigns was “small is beautiful!”, and for a while this even worked (cf. the golden age of ‘il sorpasso’ in 1987). But now that the European barriers are down, most SMEs lack the scale, the funding, or the commercial know-how to become global players. What these small companies produce is beautiful, but not so technologically sophisticated that it cannot be replicated industrially. Therefore, the economic structure of Italy is almost perfectly shaped for an attack by China.
2nd fact: the Italian stock market is too small for the size of the national economy.
Fewer than 300 companies are listed on the market. This says a lot about how Italian companies understand the current trends of globalization and international markets. To be competitive in this new era, a company has to grow to a size that allows it to export to every part of the world and to manufacture its products outside Italy (where production costs are prohibitive).
3rd fact: entering the euro zone was a complete disaster for the Italian economy.
In the past, devaluing the currency let us stay competitive against our neighbors. Now that we can no longer play that trick, we are suffering. It is also clear to me that if we had continued like that for another 10 years, we could have found ourselves in the same condition as Argentina.
Tags: Italy, politics
HyperSuper is an ‘intelligent’ news aggregator. The site gives no clear explanation of its inner clustering mechanisms. However, Mikkom, the author, wrote to me that he is going to use CNG (Context Network Graph) to cluster the news. Supa Cool!
Tags: clustering, google, information retrieval, spreading activation, text data mining
Published on 1/25/2006 in Diary.
This is a simple list-like visualization that presents the user with a map of keywords related to a given keyword. The site does not describe how the map is constructed.

(via)
Tags: google, information visualization, spatial clustering
They Rule aims to provide a glimpse of some of the relationships of the US ruling class. It takes as its focus the boards of some of the most powerful U.S. companies, which share many of the same directors. Some individuals sit on 5, 6 or 7 of the top 500 companies. It allows users to browse through these interlocking directories and run searches on the boards and companies. A user can save a map of connections complete with their annotations and email links to these maps to others. They Rule is a starting point for research about these powerful individuals and corporations.
We should have something like this for Italian politics. It would be fun to see everything around Mr. B.
Tags: information visualization, Italy, learning technology, maps, politics, social network analysis
Published on 1/22/2006 in Diary.
MontyLingua is a free, commonsense-enriched, end-to-end natural language understander for English. Feed raw English text into MontyLingua, and the output will be a semantic interpretation of that text. Perfect for information retrieval and extraction, request processing, and question answering. From English sentences, it extracts subject/verb/object tuples, extracts adjectives, noun phrases and verb phrases, and extracts people’s names, places, events, dates and times, and other semantic information.
Tags: Python, tagging, text data mining
E. Morse and M. Lewis. Testing visual information retrieval methodologies case study: Comparative analysis of textual, icon, graphical, and “spring” displays. Journal of the American Society for Information Science and Technology, 53(1):28–40, 2002. [pdf]
————————-
Although many different visual information retrieval systems have been proposed, few have been tested, and where testing has been performed, results were often inconclusive. Further, there is very little evidence of benchmarking systems against a common standard. An approach for testing novel interfaces is proposed that uses bottom-up, stepwise testing to allow evaluation of a visualization, itself, rather than restricting evaluation to the system instantiating it. This approach not only makes it easier to control variables, but the tests are also easier to perform. The methodology will be presented through a case study, where a new visualization technique is compared to more traditional ways of presenting data.
Tags: clustering, graphical user interface, information retrieval, information visualization
Published on 1/19/2006 in Diary.
These days I am pretty busy working with Lorenzo on our super secret project on Context Network Graphs. We fell behind schedule because we were trying to find a decent way to show a document collection on a two-dimensional map. We started with an ordered list of documents with ranking values.
From this one-dimensional situation we had to develop a second dimension of information, and I can now swear that it was not easy. We chose to use triangulation, and the biggest problem we fought was that some triangles did not close properly. This document demonstrates how to compute whether three documents can be placed in a triangle.
To verify that, I did a quick hack in Python that showed some gaps between the circles drawn around each pair of points (see picture below). This was fun. Finding out how to fix it was not fun at all. But finally …
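The check itself is short enough to sketch here. This is not the original hack, just a minimal reconstruction under my own assumptions: each pair of documents has a non-negative distance, the function names are made up for illustration, and a "gap" means that the circle around the first document and the circle around the second one never meet, so the third document has nowhere to go.

```python
import math


def can_form_triangle(d_ab, d_ac, d_bc):
    """True if the three pairwise distances satisfy the triangle inequality."""
    return (d_ab <= d_ac + d_bc
            and d_ac <= d_ab + d_bc
            and d_bc <= d_ab + d_ac)


def place_third_point(d_ab, d_ac, d_bc):
    """Put A at (0, 0) and B at (d_ab, 0), then return coordinates for C.

    C lies where the circle of radius d_ac around A meets the circle of
    radius d_bc around B. Returns None when the circles leave a gap
    (the triangle does not close) or when A and B coincide.
    """
    if d_ab <= 0 or not can_form_triangle(d_ab, d_ac, d_bc):
        return None
    # Standard two-circle intersection, keeping the solution with y >= 0.
    x = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
    # max() guards against tiny negative values caused by rounding.
    y = math.sqrt(max(d_ac ** 2 - x ** 2, 0.0))
    return (x, y)
```

For example, place_third_point(3, 4, 5) returns (0.0, 4.0), while place_third_point(1, 1, 3) returns None because the two circles are too small to touch.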
Tags: clustering, hack, information retrieval, map algorithms, maps, Python
meX-Search is a meta search engine that automatically categorizes search results into thematic groups and displays them by intuitive and interactive maps.
meX-search is an experimental, non-commercial meta search engine built from April to July 2004 by Karsten Knorr during his diploma thesis in computer and media science [University of Applied Science Berlin]. The main idea of the thesis was the implementation of an intuitive and simple user interface for web clustering search engines.
Users of conventional Web search engines are often forced to sift through a long list of off-topic documents to find relevant results… Especially when the search query is general, it is often hard to find relevant resources among thousands of irrelevant ones. Search result clustering is an approach to handle such problems by grouping similar documents among search results into thematic groups.
meX is a meta search engine. Currently meX is getting the search results completely from the Yahoo-API.
The clustering of the result snippets from Yahoo is based on Carrot2, an open source Java framework for clustering textual data. Within the Carrot2 framework meX uses the Lingo algorithm. The authors of the Carrot2 framework and components are Dawid Weiss, Jerzy Stefanowski and Stanislaw Osinski.
Tags: clustering, google, graphical user interface, information retrieval, information visualization, maps, search engine
Carrot2 is a research framework for experimenting with automated querying of various data sources (such as search engines), processing search results and their visualization.
Under the term “research”, we understand that the architecture of the system is oriented mostly toward flexibility, sometimes at a price of performance losses. Mechanisms such as data exchange via XML language, dynamically loaded components accessible via HTTP protocol, the use of Java as primary language of implementation — they all make the system very easy to tailor to one’s needs. Carrot2 was primarily built with search results clustering in mind, but it can be easily configured to do other, interesting things.
Tags: clustering, information retrieval, machine learning, search engine
I found this great portal of the Worldwatch Institute, an independent research organization that works for an environmentally sustainable and socially just society by providing compelling, accessible, and fact-based analysis of critical global issues. The portal offers access to a variety of publications that synthesize research on environmental facts. Most of the publications are available for a small payment that sustains the activity of the institute. I think it is a small price for the quality of the information they provide.
Browsing the site I found this article on the coal consumption projections for the year 2010:
The rapid growth in coal use in China and India, where pollution controls are minimal, is adding to local and long-distance pollution. More than 80 percent of Chinese cities in a recent World Bank survey had sulfur dioxide or nitrogen dioxide emissions above the World Health Organization’s threshold.
Scientists have concluded that growing up in a city with polluted air is about as harmful to a person’s health as growing up with a parent who smokes. Although air pollution is concentrated in cities, it can move well beyond them: for example, acidic lakes in Scandinavia have been linked to pollution from factories in the United States. The World Bank projected that on average 1.8 million people would die prematurely each year between 2001 and 2020 because of air pollution.
Tags: ecology, environment, politics, society, statistics
Published on 1/12/2006 in Diary.
These days Jean-Baptiste is working on the prototype of the Noise Sensitive Table. The idea is that this desk should react to the users' voices, offering feedback on their turn-taking and collaborative processes. Here are some shots I took as a preview.

Tags: graphical user interface, human computer interaction, information visualization, interactive furniture, ubiquitous computing, tangible interface, usability
The degree of civilization of a people is measured by the way it treats animals.
Mohandas K. Gandhi
Combine is an open system for crawling [harvesting and threshing (indexing)] Internet resources. The name is derived from the combine-harvester since the two perform their jobs in a similar way.
The Combine was initially developed as a part of the Development of a European Service for Information on Research and Education (DESIRE) project, which was funded by the European Commission within Telematics for Science Program.
It was later modified for focused crawling by integrating the automated topic classification algorithms, also developed in DESIRE, with the crawler. This work is funded by Vinnova, Swedish Agency for Innovation Systems (project P22504-1 A) and the EU project ALVIS.
Tags: google, information retrieval, open source, search engine
retrievr is an experimental service which lets you search and explore in a selection of Flickr images by drawing a rough sketch.
retrievr doesn’t do object/face/text recognition of any kind, so if you’re drawing an outline sketch of a chair, it almost certainly won’t get you one back (unless the index only contains images of chairs). The same holds for corporate logos, icons &c.
It helps to think of it as matching the most pronounced shapes and slabs of colors.
Another thing to know is that there’s currently no way to specify the aspect ratio, so you have to rescale the image in your head (things that are close to the borders of the image you’re thinking of should be close to the borders of your sketches), but that’s really more of a missing feature of the drawing flashlet than an inherent problem. Sometimes it also helps to remove detail instead of adding it.
Personally, I see retrievr more as an “exploration” tool than as a “search” tool, and it seems to work very well for that.
Tags: google, information retrieval, machine learning, Python, search engine
The project, with partners here at EPFL, conducts research in the design, use and interoperability of topic-specific search engines with the goal of developing an open source prototype of a distributed, semantic-based search engine (the architecture is reported in the picture below).
Existing search engines provide a poor foundation for semantic web operations, and US companies such as Google are becoming monopolies, distorting the entire information landscape. Our approach is not the traditional Semantic Web approach with coded or semi-automatically extracted metadata, but rather an engine that can build on content through automatic analysis. Linguistic processing is inside the search engine and a probabilistic document model provides a principled evaluation of relevance to complement existing standard authority scores. This facilitates semantic retrieval and incorporates pre-existing domain ontologies using facilities for import and maintenance. The distributed design is based on exposing search objects as resources, and on using implicit and automatically generated semantics (not ontologies) to distribute queries and merge results. Because semantic expressivity and interoperability are competing goals, developing a system that is both distributed and semantic-based is the key challenge: research involves both the statistical and linguistic format of semantic internals, and determining the extent to which the semantic internals are exposed at the interface.
Tags: google, information retrieval, open source, p2p, search engine