Monthly Archive for July, 2005


Efficient and Effective Clustering Methods for Spatial Data Mining

R. T. Ng and J. Han. Efficient and effective clustering methods for spatial data mining. In Proceedings of the 1994 International Conference on Very Large Data Bases (Sept. 12-15, Santiago, Chile), pages 144–155. Morgan Kaufmann, San Francisco, CA, 1994. [url]

——————–

This article describes CLARANS, a clustering algorithm for spatial databases that is based on randomised search. The problem with most of the methods used so far is that they require a priori knowledge to be initialised. The authors' approach, by contrast, starts from scratch.

The method was developed from the CLARA algorithm of Kaufman and Rousseeuw. The authors extend it by adding support for randomised search. Their method first finds the best k_nat, which is the most natural number of clusters. It then assigns objects to the clusters and finds the medoids.

Two versions of the method, spatial-dominant and non-spatial-dominant, are tested against CLARA on a real estate data set. The results confirm the efficacy of the new method.
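To make the randomised search concrete, here is a minimal sketch of a CLARANS-style medoid search in plain Python. It is not the authors' code: it assumes the number of clusters k is already given (it does not search for k_nat), it uses Euclidean distance on 2-D points, and the parameter names numlocal and maxneighbor as well as the default values are assumptions of this sketch.

```python
import random


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def total_cost(points, medoids):
    """Sum of the distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, numlocal=2, maxneighbor=20, seed=0):
    """Randomised search over sets of k medoids (indices into points).

    Requires len(points) > k. Returns (sorted medoid indices, cost).
    """
    rng = random.Random(seed)
    n = len(points)
    best_medoids, best_cost = None, float("inf")
    for _ in range(numlocal):  # independent restarts
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        failures = 0
        while failures < maxneighbor:
            # A neighbour differs from the current set in exactly one medoid.
            position = rng.randrange(k)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbour = current[:position] + [candidate] + current[position + 1:]
            neighbour_cost = total_cost(points, neighbour)
            if neighbour_cost < current_cost:
                current, current_cost = neighbour, neighbour_cost
                failures = 0  # moved to a better node, start counting again
            else:
                failures += 1
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost
    return sorted(best_medoids), best_cost
```

Calling clarans(points, k=2) on a list of (x, y) tuples returns the indices of the chosen medoids and the total distance of all points to them. Larger values of maxneighbor make each local search more thorough at the price of more cost evaluations.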


STAMPS description

I updated the description of the STAMPS project to reflect the new research direction of my thesis proposal, which, by the way, was approved today. Yipee!

Now I *just* need to write the thesis.

————-

M. Cherubini. Collaborative annotations of space in a mobile context: a computational model that integrates spatial information and communication. Thesis Proposal 2, CRAFT, Ecole Polytechnique Fédérale de Lausanne, EPFL, Ecublens, Station 1, CH-1015, Switzerland, 2005. [url]

————–

Mining Knowledge in Geographical Data

K. Koperski, J. Han, and J. Adhikary. Mining knowledge in geographical data. Communications of the ACM, pages 1–8, 1998. [url]

———————–

Data mining represents the confluence of several research fields, such as machine learning, database systems, data visualization, statistics, and information theory.

The authors define spatial data mining as the extraction of implicit knowledge, spatial relationships, or other patterns not explicitly stored in spatial databases. Spatial data differ from relational data because they carry topological information, usually organised by multidimensional spatial indexing structures.

The authors introduce several methods for knowledge mining in geographical data. The first is generalization-based mining: the attempt to generalise data from a low concept level to more abstract ones. This method can follow two philosophies, spatial-data-dominant generalization and non-spatial-dominant generalization, which differ in the importance given to the spatial dimension of the data.

Another methodology introduced is clustering: finding groups of objects, or densely populated regions, according to some distance measure in a large multi-dimensional data set.

A third methodology is the exploration of spatial associations, that is, rules that associate one or more spatial objects with other spatial objects. Thresholds can be used to control which associations are filtered out.
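As an illustration of threshold-based filtering, the sketch below scores candidate rules with support and confidence, the usual measures for association rules, and keeps only those above both thresholds. The summary above only speaks of "some threshold", so the choice of these two measures is an assumption, and the predicate names and records are invented for the example.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    records is a list of sets; each set holds the predicates that are
    true for one spatial object.
    """
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / len(records)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


def filter_rules(records, rules, min_support, min_confidence):
    """Keep only the rules that reach both thresholds."""
    kept = []
    for antecedent, consequent in rules:
        support, confidence = rule_stats(records, antecedent, consequent)
        if support >= min_support and confidence >= min_confidence:
            kept.append((antecedent, consequent, support, confidence))
    return kept


# Hypothetical objects described by hypothetical spatial predicates.
records = [
    {"town", "close_to_water", "close_to_highway"},
    {"town", "close_to_water"},
    {"town", "close_to_highway"},
    {"town", "close_to_water", "close_to_highway"},
]
rules = [
    ({"town"}, {"close_to_water"}),
    ({"close_to_water"}, {"close_to_highway"}),
]
strong_rules = filter_rules(records, rules, min_support=0.6, min_confidence=0.7)
```

With these toy records only the first rule survives the filter, because the second one holds for just two of the four objects.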


Tagsurf

Tagsurf is a new type of online message board which uses tags to help organize subjects instead of threads or channels. Tagsurf uses tags to help organize posts and messages between users. You can sign up for various alerts based on tags, so you can be notified of new messages instantly across a variety of mediums: IM, Chat and Email. Tagsurf is a hyper-forum: It allows you to communicate on a variety of levels, yet pulling only the messages that you feel are important.


The London Zoo

It seems that the IVREA team is working on an interesting project with Augmented Animals. They wonder what would happen in a dynamic city such as London if a portion of the city were shaped around living animals. What kind of behaviors would it evoke (among humans, among animals)? A metropolis full of animals, an interesting combination …

The team is already in London, where students are asked to build their own proposals for augmented animals and the city. Here is the description of the workshop.


CAIF Group picture

Digging in my HD I found this group picture of the CAIF participants.


Newsgroup Exploration with WEBSOM Method and Browsing Interface

T. Honkela, S. Kaski, K. Lagus, and T. Kohonen. Newsgroup exploration with websom method and browsing interface. Technical Report A32, Helsinki University of Technology, Laboratory of Computer and Information Science, Rakentajanaukio 2 C, SF-02150 Espoo, Finland, 1996. [url]

———————-

This paper presents the application of the SOM method to the exploration of a large number of Usenet messages. This method, introduced by Kohonen in 1982, is a means of automatically arranging high-dimensional statistical data so that similar inputs are in general mapped close to each other.

The general concept is that words are first organised into categories on a word category map; the documents can then be encoded in a way that explicitly expresses the similarity of the word meanings.

This word category map is a self-organising semantic map, in the definition of Ritter and Kohonen (1989), that describes relations between words based on their averaged short contexts. The SOM itself is an unsupervised method; the map is calibrated (labelled) after training. The document map is then formed with the SOM algorithm, using histograms of word categories as fingerprints of the documents.
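The document-map step can be sketched as a tiny self-organising map trained on word-category histograms. This is a generic SOM in plain Python, not the WEBSOM implementation: the grid size, the learning-rate and radius schedules, and the idea of normalised histograms as input vectors are all assumptions of the sketch.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Grid position of the unit whose weight vector is closest to vector."""
    return min(
        weights,
        key=lambda unit: sum((w - x) ** 2 for w, x in zip(weights[unit], vector)),
    )


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        # Present the documents in a fresh random order every epoch.
        for vector in rng.sample(vectors, len(vectors)):
            progress = step / total_steps
            learning_rate = 0.5 * (1.0 - progress)  # decays towards 0
            radius = max(rows, cols) / 2.0 * (1.0 - progress) + 0.5
            winner = best_matching_unit(weights, vector)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - winner[0]) ** 2 + (unit[1] - winner[1]) ** 2
                influence = math.exp(-grid_d2 / (2.0 * radius ** 2))
                # Pull the winner and its grid neighbours towards the input.
                for i in range(dim):
                    w[i] += learning_rate * influence * (vector[i] - w[i])
            step += 1
    return weights
```

After training, best_matching_unit(weights, histogram) gives the map cell of a document, so documents with similar histograms tend to end up in the same or in neighbouring cells.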

Websom Architecture

Word Category Map

Copyright notice: the present content was taken from the following URL, the copyrights are reserved by the respective author/s.


Tagzania

Tagzania is about tags and places. You can add places, points, to create and document your maps. When you add a point, you may tag it with keywords. That way, Tagzania is not only a place to build and keep your own maps; shared territories are created as well.

This reminded me of a similar Italian project: Mappe Aperte.


delicious tag clusterer

The del.icio.us tag clusterer. This little tool clusters your del.icio.us resources on the basis of your related tags. You have to provide your del.icio.us username and password to use this service.

The clustering uses the k-means algorithm. We use the Orange library with clustering extensions for this.
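For readers who have not met k-means, here is a minimal sketch in plain Python. It does not use the Orange API and is not the tool's code; representing each bookmark as a numeric vector (for example one 0/1 entry per tag) is an assumption made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            mean(cluster) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters
```

With bookmarks encoded as tuples such as (1, 1, 0, 0) for a hypothetical tag vocabulary of four tags, kmeans(bookmarks, k=2) groups the bookmarks that share most of their tags.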


websom

Websom, which stands for WEB Self-Organizing Maps, is an algorithm that orders an information space. Similar documents are placed near each other on the map. This ordering helps in finding related documents once an interesting document is found.


topten.ch

Topten is a website sponsored by the WWF that lists available products considered environmentally friendly. For each product it gives the price, the lifetime, and an energy classification.


Google Moon

Yepaa: Google Moon




Baghdad’s museum before the war

mms://mediaserver.kataweb.it/repubblica/spett_e_cult/2005/museo01.wmv


Unsupervised Clustering of Context Data and Learning User requirements for a Mobile Device

J. A. Flanagan. Unsupervised clustering of context data and learning user requirements for a mobile device. In A. Dey, B. Kokinov, D. Leake, and R. Turner, editors, Modelling and Using Context: 5th International and Interdisciplinary Conference CONTEXT 2005, Paris, France, July, Proceedings, volume 3554 of LNAI, pages 155–168. Springer-Verlag, Berlin Heidelberg, 2005. [url]

——————————–

This paper presents an unsupervised learning algorithm for inferring important locations, which can then be associated with a possible customization of the user interface of an application. The author departs from previous studies on inferring context by extending the technique so that the method can be used on the fly and without the training period that hinders the development of such techniques on mobile devices.

The algorithm is called K-SCM and is based on the idea of fusing several input sources into a string. The string is then associated with a matrix, for which the variation and the probability of presentation of a certain node at a certain time are calculated. Using solutions from neural network theory, the algorithm selects the winning node over a certain period of time, which defines the winning location to be associated with the user's interaction on the phone.


citizen power

Recently I have noticed a growing interest in the subject of citizen journalism: the Globe and Mail, The Salt Lake Tribune, The Guardian and Columbia Journalism Review (among others). I “strongly” believe in this newly emerging kind of information that is not driven by corporations, which I think will change the way we consume media, and TV in general.

More pasta: One of my colleagues worked on this subject, writing several articles on developing public expression and public opinion. I also once looked at a possible future scenario of the clash between citizen journalism and mainstream information.
