Main page - Introduction to Genetic Algorithms - Tutorial with Interactive Java Applets
http://www.obitko.com/tutorials/genetic-algorithms/index.php
e area of genetic algorithms is very wide, it is not possible to cover everything in these pages. But you should get some idea, what the genetic algorithms are and what they could be useful for. Do not expect any sophisticated mathematic
Neural Networks - A Systematic Introduction
Interesting looking ebook. Give it a look.
MIT’s Introduction to Algorithms, Lectures 17, 18 and 19: Shortest Path Algorithms - good coders code, great reuse
Could be useful for detecting cyclic redirects between web pages!
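The cyclic-redirect case mentioned above maps onto loop detection in a singly linked list, since each URL redirects to at most one next URL. A minimal sketch using Floyd's tortoise-and-hare method, which is only one of several known approaches; the `redirects` mapping and the URLs are made up for illustration.

```python
def has_redirect_cycle(start, redirects):
    """Return True if following redirects from `start` never terminates.

    `redirects` is a hypothetical dict mapping a URL to the URL it
    redirects to; URLs missing from the dict are final pages.
    """
    slow = fast = start
    while True:
        # The fast pointer advances two hops per iteration.
        for _ in range(2):
            fast = redirects.get(fast)
            if fast is None:
                return False  # reached a page that does not redirect
        # The slow pointer advances one hop.
        slow = redirects[slow]
        if slow == fast:
            return True  # the pointers met, so the chain loops


chain = {"/a": "/b", "/b": "/c"}
loop = {"/a": "/b", "/b": "/c", "/c": "/b"}
print(has_redirect_cycle("/a", chain))
print(has_redirect_cycle("/a", loop))
```

The method needs only two pointers, so memory use stays constant however long the redirect chain is.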
Several methods for finding a loop in a singly linked list, with code.
Welcome to Pyevolve documentation ! — Pyevolve v0.5 documentation
Pyevolve was developed to be a complete genetic algorithm framework written in pure python.
How Not To Sort By Average Rating
I always like it when people call out other people for bad math....
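One commonly used score for this kind of ranking is the lower bound of the Wilson score interval. The sketch below assumes each item only has positive and negative votes and uses a 95% confidence level (`z = 1.96`); the function name and those choices are illustrative assumptions, not taken from the bookmarked page.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a fraction of positive votes.

    `positive` and `total` are vote counts; `z` is the normal quantile
    for the chosen confidence level (1.96 is roughly 95%).
    """
    if total == 0:
        return 0.0  # no votes yet: rank at the bottom
    p = positive / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - margin) / (1 + z * z / total)


# One perfect vote should not outrank 90 positives out of 100.
print(wilson_lower_bound(1, 1) < wilson_lower_bound(90, 100))
```

Sorting by this bound rather than by the raw average keeps items with very few votes from jumping to the top.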
You are a web programmer. You have users. Your users rate stuff on your site. You want to put the highest-rated stuff at the top and lowest-rated at the bottom. You need some sort of "score" to sort by.
Explaining Complement Naive Bayes, which the new Hatena Bookmark also uses - 射撃しつつ前転
The new Hatena Bookmark automatically classifies bookmark entries into categories, and the algorithm used for this classification is apparently Complement Naive Bayes. Today I will introduce this algorithm.
Complement Naive Bayes
Code: Flickr Developer Blog » Found in space
Developer Blog
In this Flickr group, your photos of the sky are resolved into the constellations they show. kewl :)
The “blind astrometry server” is a program which monitors the Astrometry group on Flickr, looking for new photos of the night sky. It then analyzes each photo, and from the unique star positions shown it figures out what part of the sky was photographed and what interesting planets, galaxies or nebulae are contained within. Not only does the photographer get a high-quality description of what’s in their photo, but the main Astrometry.net project gets a new image to add to its storehouse of knowledge.
cool!
Crowd-sourced sky cataloguing.
Directed Edge News » Blog Archive » On Building a Stupidly Fast Graph Database
connected to and things that connect to them. These are symmetrical, so creating a link from item A to item B creates a reference from item B to item A.
Jonathan Ellis's Programming Blog - Spyced: All you ever wanted to know about writing bloom filters
http://www.cs.umd.edu/~gasarch/BLOGPAPERS/egg.pdf
Annotated link http://www.diigo.com/bookmark/http%3A%2F%2F20bits.com%2Farticles%2Finterview-questions-two-bowling-balls
Advanced Computer Science Courses at Paper Trail
Below I've collected some links to advanced computer science courses on-line. I'm concentrating on courses with good lecture notes, rather than video lectures, and I'm applying a rather arbitrary filter for quality (otherwise this becomes a directory with less semantic utility). This is the good stuff! But only a subset of it - any recommendations for good courses are gratefully received. I'm mainly interested in systems, data-structures and mathematics, so reserve the right to choose topics at will.
Open courseware from various sources. High quality too.
Longest common subsequence
Starting with a list of runners ordered by finishing time, select a sublist of runners who are getting younger. What is the longest such sublist?
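The puzzle above can be treated as a longest-common-subsequence problem: the longest run of ever-younger runners is the longest subsequence that the age list shares with its own distinct ages sorted from oldest to youngest. A small sketch, assuming ages are plain integers and using `functools.lru_cache` as the memoising decorator; the function names and sample ages are made up, and the recursion is only suitable for short lists.

```python
from functools import lru_cache


def longest_common_subsequence(xs, ys):
    """Return one longest common subsequence of xs and ys as a list."""
    xs, ys = tuple(xs), tuple(ys)

    @lru_cache(maxsize=None)
    def lcs(i, j):
        # Longest common subsequence of xs[i:] and ys[j:].
        if i == len(xs) or j == len(ys):
            return ()
        if xs[i] == ys[j]:
            return (xs[i],) + lcs(i + 1, j + 1)
        skip_x, skip_y = lcs(i + 1, j), lcs(i, j + 1)
        return skip_x if len(skip_x) >= len(skip_y) else skip_y

    return list(lcs(0, 0))


def longest_younger_run(ages):
    """Longest sublist of strictly decreasing ages, kept in finishing order."""
    return longest_common_subsequence(ages, sorted(set(ages), reverse=True))


ages = [31, 28, 35, 27, 30, 24, 26]  # hypothetical ages in finishing order
print(longest_younger_run(ages))
```

Memoisation brings the cost down from exponential to roughly the product of the two list lengths, at the price of caching one tuple per subproblem.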
Longest common subsequence
Taking a brief step back, this article is the third of a series. In the first episode we posed a puzzle: Starting with a list of runners ordered by finishing time, select a sublist of runners who are getting younger. What is the longest such sublist? In the second episode we coded up a brute force solution which searched all possible sublists to find an optimal solution. Although the code was simple and succinct, its exponential complexity made it unsuitable for practical use. In this episode we’ll discuss an elegant algorithm which solves our particular problem as a special case. On the way we’ll visit dynamic programming, Python decorators, version control and genetics.
MIT’s Introduction to Algorithms, Lectures 20 and 21: Parallel Algorithms - good coders code, great reuse
Lectures
This is the thirteenth post in an article series about MIT’s lecture course “Introduction to Algorithms.” In this post I will review lectures twenty and twenty-one on parallel algorithms. These lectures cover the basics of multithreaded programming and multithreaded algorithms.
Easy AI with Python (#115) - PyCon 2009 - Chicago - A Conference for the Python Community
Survey several basic AI techniques implemented with short, open-source Python code recipes. Appropriate for educators and programmers who want to experiment with AI and apply the recipes to their own problem domains. For each technique, learn the basic operating principle, discuss an approach using Python, and review a worked-out example. We'll cover database mining using neural nets, automated categorization with a naive Bayesian classifier, solving popular puzzles with depth-first and breadth-first searches, solving more complex puzzles with constraint propagation, and playing a popular game using a probing search strategy.
Probably the most beautiful code I have ever seen. Lovely algorithms in elegant style.
Some AI examples made in Python. Discusses the AI techniques and the code.
Singular Value Decomposition
Sorting Algorithm Animations
via the cairo graphics lib, see cairographics.org
Cool, visual way of showing sorting algorithms
Static images of sorting algorithms, pretty neat!
"This whole thing started partly as an excuse to get familiar with the Cairo graphics library. It produces beautiful, clean images, and appears to be both portable and well designed. It also comes with a set of Python bindings that are maintained as part of the project itself - a big plus in my books. Firefox 3 will use Cairo as its standard rendering back end, which will instantly make it one of the most widely used vector graphics libraries out there. "
I dislike animated sorting algorithm visualisations - there's too much of an air of hocus-pocus about them. Something impressive and complicated happens on screen, but more often than not the audience is left mystified. I think their creators must also know that they have precious little explanatory value, because the better ones are sexed up with play-by-play doodles, added, one feels, as an apologetic afterthought by some particularly dorky sportscaster. Nevertheless I've been unable to find a single attempt to visualise a sorting algorithm statically (if you know of any, please drop me a line). So, presented below are the results of a pleasant evening with some nice Scotch and the third volume of Knuth. First, here's a taster - a static visualisation of heapsort: Heapsort I think these simple static visualisations are much clearer than most animated attempts - and they have the added benefit of also being, to my not entirely unbiased eye, rather beautiful.
distributed systems primer :: snax
I've been reading a bunch of papers about distributed systems recently, in order to help systematize for myself the thing that we built over the last year. Many of them were originally passed to me by Toby DiPasquale. Here is an annotated list so everyone can benefit. It helps if you have some algorithms literacy, or have built a system at scale, but don't let that stop you.
SMS: "Tim Gowers - Computational Complexity and Quantum Compuation"
Computational complexity lectures
Fields Medalist Tim Gowers' lectures on computational complexity.
How Google produces relevant search results | News | TechRadar UK
A link to the 20-page Stanford paper on Google's PageRank by Sergey Brin and Lawrence Page.
How Google produces relevant search results: Indexing and PageRank explained http://bit.ly/UsMK7 [from http://twitter.com/KeithDriscoll/statuses/1918603384]
Understanding Ternary Trees | PC Plus
Ternary trees are the fastest way to search for data strings, at least in hardcore programming terms, but how exactly do they work?
Yury Lifshits | Algorithmic Problems Around the Web
Big O notation is used in Computer Science to describe the performance or complexity of an algorithm. Big O specifically describes the worst-case scenario, and can be used to describe the execution time required or the space used (e.g. in memory or on disk) by an algorithm.
Warping Text To Bézier curves
I want one of these! "We have created scalable infrastructure, named Pregel, to mine a wide range of graphs. In Pregel, programs are expressed as a sequence of iterations. In each iteration, a vertex can, independently of other vertices, receive messages sent to it in the previous iteration, send messages to other vertices, modify its own and its outgoing edges' states, and mutate the graph's topology (experts in parallel processing will recognize that the Bulk Synchronous Parallel Model inspired Pregel). Currently, Pregel scales to billions of vertices and edges, but this limit will keep expanding. Pregel's applicability is harder to quantify, but so far we haven't come across a type of graph or a practical graph computing problem which is not solvable with Pregel. It computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code. "
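The vertex-centric model in the quote above can be imitated in a few lines. This is only a single-process sketch of the idea, not Pregel's API: `pagerank_supersteps`, its parameters and the toy graph are made-up names, and it assumes every vertex has at least one outgoing edge and appears as a key of `out_edges`.

```python
def pagerank_supersteps(out_edges, supersteps=30, damping=0.85):
    """PageRank written as synchronous supersteps over vertices.

    `out_edges` maps each vertex to the list of vertices it links to.
    In every superstep a vertex reads the messages sent to it in the
    previous superstep, updates its own rank, and sends messages along
    its outgoing edges.
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    inbox = {v: [] for v in out_edges}
    for step in range(supersteps):
        outbox = {v: [] for v in out_edges}
        for v, targets in out_edges.items():
            if step > 0:
                # Combine the messages received from the previous superstep.
                rank[v] = (1 - damping) / n + damping * sum(inbox[v])
            for t in targets:
                # Share this vertex's rank equally among its out-neighbours.
                outbox[t].append(rank[v] / len(targets))
        inbox = outbox  # barrier: messages become visible next superstep
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # toy graph for illustration
print(pagerank_supersteps(graph))
```

Each vertex only touches its own state and its own inbox, which is what lets a real system run the inner loop on many machines at once.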
Kernel
So many things to learn and apply in business deals.
http://spinn3r.com/rank
Stephen Marsland
Stephen Marsland, Massey University
"I've written a textbook ... there are lots of Python code examples in the book, and the code is available here."
Machine Learning: An Algorithmic Perspective
"I've written a textbook entitled "Machine Learning: An Algorithmic Perspective". It will be published by CRC Press, part of the Taylor and Francis group, on 2nd April 2009. The book is aimed at computer science and engineering undergraduates studying machine learning and artificial intelligence. There are lots of Python code examples in the book, and the code is available here. Where special datasets are used they are provided with the code, and there are links to additional datasets at the bottom of the page."
Computer Science Books Online
free computer science books online in PDF format
How b-tree database indexes work and how to tell if they are efficient (100' level) | mattfleming.com
A team member thought we should add an index on a 90 million row table to improve performance. The field on which he wanted to create this index had only four possible values. To which I replied that an index on a low cardinality field wasn't really going to help anything. My boss then asked me why wouldn't it help? I sputtered around for a response but ended up telling him that I'd get back to him with a reasonable explanation.
Imported from http://twitter.com/newsycombinator/status/2645303258 How b-tree database indexes work and how to tell if they are efficient http://bit.ly/dd6mf
MIT’s Introduction to Algorithms, Lectures 22 and 23: Cache Oblivious Algorithms - good coders code, great reuse
From the simple to the intricate, geometry is an inescapable part of graphics programming.
iPhone Sudoku Grab: How does it all work?
How he solved a limited purpose computer vision problem by applying knowledge about the problem domain.
Because iPhones can recognize Sudokus quite easily.
Plain English Explanation of Big O Notation
I've met too many developers who don't grok big O
A short history of btrfs [LWN.net]
In this article, we'll take a behind-the-scenes look at the design and development of btrfs on many levels - technical, political, personal - and trace it from its origins at a workshop to its current position as Linus's root file system. Knowing the background and motivation for each step will help you understand why btrfs was started, how it works, and where it's going in the future. By the end, you should be able to hand-wave your way through a description of btrfs's on-disk format.
btrfs is a B-tree-based file system that is copy-on-write (COW) friendly (i.e. by removing sibling links you don't have to copy the whole tree on a block update). Supports snapshots, checksums, etc. The implementation comes out of Oracle and has some commonalities with ZFS.
Feature Column from the AMS
An intuitive explanation of the geometric meaning behind SVD.
Good explanation of the SVD
Geometric interpretation of SVD.
The Matrix, but with money: the world of high-speed trading - Ars Technica
The Matrix, but with money
Supercomputers pitted against one another in a high-stakes battle of attack and counterattack over a global network where predatory algorithms trawl the information stream, competing every millisecond to gain an informational advantage over rivals. It sounds like Hollywood fiction, but it's just an average trading day on the stock market.
NP Contemplation: Clojure: Genetic Mona Lisa problem in 250 beautiful lines
Clojure is surrounded by hype these days. The word on the streets is that Clojure is the Next Big Thing. It has access to the largest library of code and it proposes a nice solution to the concurrency problem. Lots more has been said... But I haven't seen a lot of code. So I set out to make a small but meaningful program in Clojure to get a sense of its potential. I give Clojure two thumbs up, and I think you will too. The Mona Lisa Problem: The program I present tries to paint Mona Lisa with a small number of semi-transparent colored polygons. It does so by using Darwin's theory of evolution to evolve programs that draw Mona Lisa.
The (clojure) program I present tries to paint Mona Lisa with a small number of semi-transparent colored polygons. It does so by using Darwin's theory of evolution to evolve programs that draw Mona Lisa.
The Status of the P Versus NP Problem | September 2009 | Communications of the ACM
A repository for algorithms, and an environment for collaborative development.
Looks like an interesting Stack Overflow for algorithms. I'll keep watching this one... doesn't have Kelly ratio or even Sharpe ratio yet.
Stephen Marsland
Machine Learning: An Algorithmic Perspective
Python codes from a textbook entitled "Machine Learning: An Algorithmic Perspective"
by Stephen Marsland
I've written a textbook entitled "Machine Learning: An Algorithmic Perspective". It will be published by CRC Press, part of the Taylor and Francis group, on 2nd April 2009. The book is aimed at computer science and engineering undergraduates studying machine learning and artificial intelligence. There are lots of Python code examples in the book, and the code is available here. Where special datasets are used they are provided with the code, and there are links to additional datasets at the bottom of the page.
linkiblog | How to Build a Popularity Algorithm You can be Proud of
Free course/lecture notes on optimization algorithms: genetic algorithms, simulated annealing, particle swarm optimization
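Hill-climbing is probably the simplest member of this family and shows the shared shape: propose a candidate, grade it, keep it if it is no worse. The sketch below is a generic illustration, not code from the lecture notes; `hill_climb`, `grade`, `neighbour` and the toy objective are all made-up names.

```python
import random


def hill_climb(grade, start, neighbour, steps=1000, seed=0):
    """Keep replacing the current solution with a random neighbour
    whenever the neighbour grades at least as well."""
    rng = random.Random(seed)
    best, best_grade = start, grade(start)
    for _ in range(steps):
        candidate = neighbour(best, rng)
        candidate_grade = grade(candidate)
        if candidate_grade >= best_grade:
            best, best_grade = candidate, candidate_grade
    return best, best_grade


def grade(x):
    # Toy objective with its maximum at x = 3.
    return -(x - 3.0) ** 2


def neighbour(x, rng):
    # Small random tweak of the current solution.
    return x + rng.uniform(-0.5, 0.5)


best, score = hill_climb(grade, 0.0, neighbour)
print(best, score)
```

Simulated annealing and genetic algorithms change how candidates are proposed and when worse ones are accepted, but they rely on the same ability to grade a candidate.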
"What is a Metaheuristic? A common but unfortunate name for any stochastic optimization algorithm intended to be the last resort before giving up and using random or brute-force search. Such algorithms are used for problems where you don't know how to find a good solution, but if shown a candidate solution, you can give it a grade. The algorithmic family includes genetic algorithms, hill-climbing, simulated annealing, ant colony optimization, particle swarm optimization, and so on. "
This is an open set of lecture notes on metaheuristics algorithms, intended for undergraduate students, practitioners, programmers, and other non-experts. It was developed as a series of lecture notes for an undergraduate course I taught at GMU. The chapters are designed to be printable separately if necessary. As it's lecture notes, the topics are short and light on examples and theory.
a bestiary of algorithmic trading strategies « Locklin on science
Quants come in three basic varieties:
1. Structurers: people who price complex financial instruments.
2. Risk managers: people who manage portfolio risk.
3. Quant traders: people who use statistics to make money by buying and selling.
Most quants are structurers. Of course, there is often bleed over between these varieties - but it’s a useful taxonomy for looking for work. I’ve done a little of all three at this point (very little, honestly), and have always liked quant trading problems more than the other two varieties. It’s the most ambitious, and the most likely to net you a career outside of a large organization (go me: Army of one!). It’s also the most mysterious, since successful quant traders don’t like to talk about what they do. Structurers and risk managers have to talk about what they do, almost by definition. Quant traders gain little from talking about their special sauce.
***** very good and deep articles on finance topics by "Locklin on science"
vocab of "job specs" in trading
Eli Bendersky’s website » Blog Archive » Co-routines as an alternative to state machines
Observation: Co-routines are to state machines what recursion is to stacks. When you have to traverse some sort of a nested data structure (say, a binary tree), one approach is to create a stack that remembers where in the tree you are. Another, much more elegant approach, is to write the function recursively. A recursive function employs the machine stack used to implicitly implement function calls - you get the benefits of the stack without paying the cost of reduced readability.
Timefire: On Reducing the Size of Compressed Javascript (by up to 20%)
"...what effect does the large-scale structure of the JS output code have on the DEFLATE algorithm of GZIP which is used to serve up compressed script?" Another instance of using knowledge of the specific file type to get gains in compression. Is there a web proxy running all this at which I can point my phone?
On JavaScript minification and compression.
Better compression through instruction rearrangement. This guy drives me somewhat crazy, but he does cool work.
Machine learning classifier gallery
Interesting comparative performance of various algorithms on different data
A highly informative visualization of the biases of different ML classifiers. Really useful, especially for talks to non-experts.
Netflix prize tribute: Recommendation algorithm in Python | This Number Crunching Life
Quick implementation of the Netflix recommendation algorithm (probabilistic matrix factorization) in Python.
Probabilistic matrix factorisation
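The linked post tests its model on synthetic ratings built from made-up latent vectors. A rough sketch of that data-generation step using only the standard library; the function name, sizes, noise level and Gaussian choices are assumptions for illustration rather than the post's actual code.

```python
import random


def make_synthetic_ratings(n_users=5, n_items=4, n_factors=2,
                           n_ratings=12, noise=0.1, seed=0):
    """Return (user, item, rating) triples from hidden latent vectors.

    The latent vectors are local variables, so only the noisy ratings
    leave this function and are visible to whatever model is trained.
    """
    rng = random.Random(seed)
    users = [[rng.gauss(0, 1) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.gauss(0, 1) for _ in range(n_factors)] for _ in range(n_items)]
    all_pairs = [(u, i) for u in range(n_users) for i in range(n_items)]
    ratings = []
    for u, i in rng.sample(all_pairs, n_ratings):
        # Rating = dot product of the two latent vectors plus Gaussian noise.
        dot = sum(a * b for a, b in zip(users[u], items[i]))
        ratings.append((u, i, dot + rng.gauss(0, noise)))
    return ratings


for user, item, rating in make_synthetic_ratings()[:3]:
    print(user, item, round(rating, 3))
```

Because the true latent vectors are known to the test author, a fitted model can be judged by how well it predicts the held-back user and item pairs.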
I test my code using synthetic data, where I first make up latent vectors for users and items, then I generate some training set ratings by multiplying some latent user vectors by latent item vectors then adding some noise. I then discard the latent vectors and just give the model the synthetic ratings.
Nihilogic : Canvas Visualizations of Sorting Algorithms
Advanced Encryption Standard (AES)
Good explanation of AES Rijndael.
Dictionary of Algorithms and Data Structures
Definitions of algorithms, data structures, and classical Computer Science problems. Some entries have links to implementations and more information.
Puzzle: Fast Bit Counting « Reflections
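The one-line C fragment quoted in this entry looks like the last step of the well-known octal-mask bit count for 32-bit words (often attributed to HAKMEM). The Python transcription below is a sketch under that assumption; the first line of the function body is reconstructed from the commonly published version, not from the bookmarked page.

```python
def bit_count_32(n):
    """Count the set bits of a 32-bit unsigned integer."""
    n &= 0xFFFFFFFF
    # After this step every octal digit of tmp holds the number of set
    # bits in the corresponding 3-bit group of n.
    tmp = n - ((n >> 1) & 0o33333333333) - ((n >> 2) & 0o11111111111)
    # Add neighbouring digits into 6-bit fields, then sum the fields:
    # since 64 % 63 == 1, taking the value modulo 63 adds the base-64 digits.
    return ((tmp + (tmp >> 3)) & 0o30707070707) % 63


for value in (0, 1, 7, 0x80000000, 0xFFFFFFFF):
    print(value, bit_count_32(value), bin(value).count("1"))
```

The two printed counts should agree for every value; the trick only works for words of at most 62 set bits, which a 32-bit word always satisfies.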
return ((tmp + (tmp >> 3)) & 030707070707) % 63;
Guide to Getting Started in Machine Learning | A Beautiful WWW
Discussion of computer science publications. Embedded image coding using zerotrees of wavelet coefficients. Posted by dcoetzee on July 8, 2009.
delicious blog » How SPEAR Identifies Domain Experts within Delicious
analyzing user behavior to find experts
SPEAR (Spamming-resistant Expertise Analysis and Ranking) is a new technique to measure the expertise of users by analyzing their public activities on platforms like Delicious.
"A major problem of the Internet today is that finding high quality information is not easy nor fast. The steady increase of spam and junk content on the Web further complicates this challenge. Another related issue is that finding knowledgeable and trustworthy users on social platforms like Delicious is much more difficult than it should be. Wouldn’t it be nice if Delicious recommended “good” users with similar interests? Or wouldn’t it be helpful if you could get a selection of great websites on jewelry or mortgage without being overwhelmed by spam? To tackle this problem, we created the SPEAR algorithm. SPEAR (Spamming-resistant Expertise Analysis and Ranking) is a new technique to measure the expertise of users by analyzing their public activities on platforms like Delicious. A great benefit of SPEAR is that it returns two very useful sets of results: first, a list of users ranked by their expertise; and second, a list of websites ranked by their quality."
good, but missing essential parts for recommendations for educational system.
Damn Cool Algorithms: Spatial indexing with Quadtrees and Hilbert Curves - Nick's Blog
How to find the location of a particular point in a Hilbert curve. (via delicious popular)
Summary of all the MIT Introduction to Algorithms lectures - good coders code, great reuse
"As you all may know, I watched and posted my lecture notes of the whole MIT Introduction to Algorithms course. In this post I want to summarize all the topics that were covered in the lectures and point out some of the most interesting things in them."
Finally: Finger Trees! : Good Math, Bad Math
What finger trees do is give me a way of representing a list that has both the convenience of the traditional cons list, and the search efficiency of the array based method. The basic idea of the finger tree is amazingly simple. It's a balanced tree where you store all of the data in the leaves. The internal nodes are just a structure on which you can hang annotations, which you can use for optimizing search operations on the tree.
"The basic idea of the finger tree is amazingly simple. It's a balanced tree where you store all of the data in the leaves. The internal nodes are just a structure on which you can hang annotations, which you can use for optimizing search operations on the tree. What makes the finger tree so elegant is the way that some very smart people have generalized the idea of annotations to make finger trees into a single, easily customizable structure that's useful for so many different purposes: you customize the annotations that you're going to store in the internal nodes according to the main use of your tree." A commenter on the article notes, however: "Ørjan Johanse is right. You described a monoid-annotated-binary-tree, which is not enough to be a finger tree."
List of Algorithms
Ullman Set: position[members[i]] = i
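The invariant quoted above, `position[members[i]] = i`, is what makes this set representation work: `members` is a dense list of the current elements and `position` records where each element sits in it. A minimal sketch for integers in `range(universe_size)`; the class and method names are made up for illustration.

```python
class SparseSet:
    """Set of small integers with O(1) add, remove, membership and clear."""

    def __init__(self, universe_size):
        self.members = [0] * universe_size   # first `count` slots are valid
        self.position = [0] * universe_size  # index of each value in members
        self.count = 0

    def __contains__(self, x):
        i = self.position[x]
        # x is present only if its recorded slot is live and points back at x.
        return i < self.count and self.members[i] == x

    def add(self, x):
        if x not in self:
            self.members[self.count] = x
            self.position[x] = self.count
            self.count += 1

    def remove(self, x):
        if x in self:
            i = self.position[x]
            last = self.members[self.count - 1]
            # Move the last live element into the freed slot.
            self.members[i] = last
            self.position[last] = i
            self.count -= 1

    def clear(self):
        self.count = 0  # stale entries fail the membership check

    def __iter__(self):
        return iter(self.members[:self.count])


s = SparseSet(10)
for value in (3, 7, 1):
    s.add(value)
s.remove(3)
print(sorted(s), 7 in s, 3 in s)
```

Clearing is a single assignment because stale data in either array can never satisfy both halves of the membership test.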
Ullman set, an excellent tutorial
Monoids and Finger Trees
"A very powerful application of monoids are 2-3 finger trees, first described by Ralf Hinze and Ross Patterson. Basically, they allow you to write fast implementations for pretty much every abstract data type mentioned in Okasaki's book on purely functional data structures. For example, you can do sequences, priority queues, search trees and priority search queues. Moreover, any fancy and custom data structures like interval trees or something for stock trading are likely to be implementable in this framework as well. How can one tree be useful for so many different data structures? The answer: monoids! Namely, the finger tree works with elements that are related to a monoid, and all the different data structures mentioned above arise by different choices for this monoid. Let me explain how this monoid magic works."
Ask Proggit: Recommender a compsci paper for me to read this weekend : programming
I've tried to span as many subjects as possible to have a little something for everyone while limiting myself to foundational papers that have had a lasting impact on the field and are also highly readable. Some of the people (Chomsky, Shannon, Metropolis, Ulam) represented here might not consider themselves computer scientists but the papers I've included have been so important that they cannot be left out. I admit a few papers may seem like idiosyncratic picks due to my particular interest in certain areas like computer graphics and computational dynamics. There are several important papers I couldn't include due to an absence of freely available copies, e.g. Rissanen's Generalized Kraft Inequality and Arithmetic Coding.
I am looking for something clever or thought provoking that doesn't depend on too much background knowledge, and is easy to read without too much formalism/maths.
Recommender a compsci paper for me to read this weekend
Algorithm Tutorials
Graph
Ruby Algorithms: Sorting, Trie & Heaps - igvita.com
Collection of some useful Ruby data structures all coded up and ready for use.
New algorithm guesses SSNs using date and place of birth - Ars Technica
Given these numbers, the authors estimate that even a moderate-sized botnet of 10,000 machines could successfully obtain identity verifications for younger residents of West Virginia at a rate of 47 a minute.
Two researchers have found that a pair of antifraud methods intended to increase the chances of detecting bogus social security numbers has actually allowed the statistical reconstruction of the number using information that many people place on social networking sites.
Trees In The Database - Advanced data structures
A presentation about modelling trees relationally and storing them in an SQL database.
Storing tree structures in a bi-dimensional table has always been problematic. The simplest tree models are usually quit
Trees in database
Algorithm Library Design: Lecture Notes
Library design is language design. [Stroustrup] Course Goal: To learn how to implement software libraries, such as STL, CGAL, LEDA, ..., that have a focus on algorithms and data structures. To learn advanced programming techniques in C++, such as templates, generic programming, object-oriented design, design patterns, and large-scale C++ software design.
octo.py: quick and easy MapReduce for Python
octo.py: quick and easy MapReduce for Python
Showcases an example of using the MapReduce system octo.py
The C Programming Language: 4.10
4.10 Recursion
Similar to 'Pride and Prejudice and Vampires', here is an adaptation of a famous book in the computer science canon, rewritten in the style of H.P. Lovecraft, specifically 'The Shadow Over Innsmouth' (which I'm coincidentally also reading on my iPhone).
I never heard of C Recursion till the day before I saw it for the first and– so far– last time. They told me the steam train was the thing to take to Arkham; and it was only at the station ticket-office, when I demurred at the high fare, that I learned about C Recursion. The shrewd-faced agent, whose speech shewed him to be no local man, made a suggestion that none of my other informants had offered. "You could take that old bus, I suppose," he said with a certain hesitation. "It runs through C Recursion, so the people don't like it. I never seen more'n two or three people on it– nobody but them C folks."
Brian W Kernighan & Dennis M Ritchie & HP Lovecraft
#include <stdio.h>

void Cthulhu (int Ia)
{
    if (Ia/10)
        Cthulhu (Ia/10);   /* print the leading digits first */
    putchar // ftagn!
        (Ia % 10 + '0');   /* then the last digit */
} // neblod zin!
Calculate exp() and log() Without Multiplications
patterns for parallel programming
Doom Classic code review.
Also http://fabiensanglard.net/doomIphone/index.php
Doom 1993 code review
The Knight's Tour
Pyth
The Knight's Tour is a mathematical problem involving a knight on a chessboard. The knight is placed on the empty board and, moving according to the rules of chess, must visit each square exactly once.
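A compact way to search for such a tour is backtracking that tries the most constrained squares first (Warnsdorff's ordering). The sketch below is an illustration rather than the bookmarked page's program; `knights_tour`, the 5x5 default board and the corner start are assumptions chosen to keep the search small.

```python
# The eight (row, column) offsets a knight can move by.
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knights_tour(size=5, start=(0, 0)):
    """Return a list of squares visiting every square once, or None."""
    visited = {start}
    path = [start]

    def free_neighbours(square):
        r, c = square
        return [(r + dr, c + dc) for dr, dc in MOVES
                if 0 <= r + dr < size and 0 <= c + dc < size
                and (r + dr, c + dc) not in visited]

    def extend():
        if len(path) == size * size:
            return True
        # Warnsdorff ordering: try squares with the fewest onward moves first.
        candidates = sorted(free_neighbours(path[-1]),
                            key=lambda s: len(free_neighbours(s)))
        for nxt in candidates:
            visited.add(nxt)
            path.append(nxt)
            if extend():
                return True
            visited.remove(nxt)  # dead end: undo the move and try the next
            path.pop()
        return False

    return path if extend() else None


tour = knights_tour()
print(tour)
```

The ordering is only a heuristic, so the backtracking step is kept to guarantee that a tour is found whenever one exists from the chosen start square.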
d at runtime g_board = [] # the board will be constructed as a list o
Old-school programming techniques you probably don't miss
RT @estherschindler: I wrote: "Old-school programming techniques you probably don't miss" http://bit.ly/5BoWr - I disagree with some of it. [from http://twitter.com/nealrichter/statuses/1652091891]
Some Stuff - Screaming Duck Software
A good idea of image compression based on genetic algorithms.
Exclusive: How Google’s Algorithm Rules the Web | Magazine
Want to know how Google is about to change your life? Stop by the Ouagadougou conference room on a Thursday morning. It is here, at the Mountain View, California, headquarters of the world’s most powerful Internet company, that a room filled with three dozen engineers, product managers, and executives figure out how to make their search engine even smarter. This year, Google will introduce 550 or so improvements to its fabled algorithm, and each will be determined at a gathering just like this one. The decisions made at the weekly Search Quality Launch Meeting will wind up affecting the results you get when you use Google’s search engine to look for anything — “Samsung SF-755p printer,” “Ed Hardy MySpace layouts,” or maybe even “capital Burkina Faso,” which just happens to share its name with this conference room. Udi Manber, Google’s head of search since 2006, leads the proceedings. One by one, potential modifications are introduced, along with the results of months of testing in vari
Philosophical (?) article about Google's algorithms
Dealing with Duplicate Person Data - Proud to Use Perl
I've recently been working on a fairly large project that that has contact information for almost 2 million people. These records contain details for both online and offline actions. Since the data can come from multiple sources there exist many duplicate records. Duplicate records mean more processing for our code, more storage space and more hassle for our clients who have to deal with these duplicates. All in all, bad things to leave lying around. In this article we'll look at some strategies that I used to identify and remove these duplicates. All code in this article are samples, and we'll leave the task of assembling them into a final working program up to the reader. CPAN is your Friend Like all good Perl projects, we will make heavy use of the CPAN. It makes our lives so much easier and every day I'm more in awe at the quality and bredth of solutions I find there. For this project we'll be using Text::LevenshteinXS, Lingua::EN::Nickname and Parallel::ForkManager. What is a Du
Funny to see people still using Perl these days, but a great example.
re2 - Project Hosting on Google Code
"an efficient, principled regular expression library"
RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
Data Compression Explained
the best math books for the subject at hand
Mathematics is wonderful!
12 Reasons To Be Learning Graph Theory
RT @Kellblog: 12 Reasons To Be Learning (or at least paying attention to) Graph Theory http://bit.ly/a1F9hY #linkeddata #rdf #eav #gt
graph-theory-algorithms-book - Project Hosting on Google Code
This is the eleventh post in an article series about MIT's lecture course Introduction to Algorithms. In this post I ...
Prime Numbers and the Benford’s Law | Pyevolve
"Prime Numbers and the Benford's Law | Pyevolve" http://hub.tm/?RHOqX [from http://twitter.com/carreonG/statuses/1747034327]
Pyevolve - A complete genetic algorithm framework written in pure Python
15 Real-World Applications of Genetic Algorithms
Some of the most useful applications of genetic algorithms in the real world.
ongoing · The Web Curriculum
Suggested by arun
[Found via Coast to Coast Bio] Tim Bray outlines a new CS curriculum that re-focuses on the web as a platform (rather than an individual computer). Under this training, CS students would graduate prepared for the modern challenges of working with big data.
The World Wide Web as a framework for structuring much of the academic Computer Science curriculum.
Viewing the World Wide Web as a framework for structuring part of the academic Computer Science (and Computer Engineering, perhaps) curriculum. Includes a link to "The first few milliseconds of a HTTPS connection", which should be as fascinating a read.
http://www.cs.nyu.edu/courses/fall08/G22.2965-001/geneticalgex
Although the configuration program specified tasks for all 100 cells, it transpired that only 32 were essential to the circuit's operation. Thompson could bypass the other cells without affecting it. A further five cells appeared to serve no logical purpose at all--there was no route of connections by which they could influence the output. And yet if he disconnected them, the circuit stopped working.
genetic fpga evolution/programming
CREATURES FROM PRIMORDIAL SILICON
apply evolution to digital FPGA
Using FPGAs to evolve solutions to problems.
assertTrue( ): One of the toughest job-interview questions ever
I mentioned in a previous post that I once interviewed for a job at a well-known search company. One of the five people who interviewed me asked a question that resulted in an hour-long discussion: "Explain how you would develop a frequency-sorted list of the ten thousand most-used words in the English language." I'm not sure why anyone would ask that kind of question in the course of an interview for a technical writing job (it's more of a software-design kind of question), but it led to a lively discussion, and I still think it's one of the best technical-interview questions I've ever heard. Ask yourself: How would you answer that question?
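One possible starting point for the interview question, sketched in Python: count words in a hash table (a plain dict) and then sort by frequency. The function name, the tokenizing regular expression, and the sample sentence are assumptions for illustration; a real answer would also have to discuss where the corpus comes from and how to handle text too large for memory.

```python
import re


def most_frequent_words(text, n):
    """Return the n most frequent words as (word, count), most frequent first."""
    counts = {}  # hash table: word -> number of occurrences
    for word in re.findall(r"[a-z']+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


if __name__ == "__main__":
    print(most_frequent_words("the cat and the hat and the bat", 2))
```

Counting is linear in the number of words on average, and the final sort costs O(k log k) for k distinct words, which is small compared with the corpus itself.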
The author talks about a question he got at a job interview, and goes on to provide a reasonable recap/discussion about hash tables. This is generally the kind of answer I look for when I ask similar questions. 9/10 candidates I talk with can't actually discuss a hash function, and don't know how to create one.
Dynamic Programming Practice Problems
a collection of practice dynamic programming problems and their solutions.
This site contains a collection of practice dynamic programming problems and their solutions. The problems listed below are also available in a pdf handout. To view the solution to one of the problems below, click on its title. To view the solutions, you'll need a machine which can view Macromedia Flash animations and which has audio output. If you want, you can also view a quick review from recitation on how to solve the integer knapsack problem (with multiple copies of items allowed) using dynamic programming.
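The integer knapsack problem with multiple copies of items allowed can be sketched as a one-dimensional table over capacities. This is a generic textbook formulation in Python, not the site's own solution; the function name and the sample items are assumptions, and weights are assumed to be positive integers.

```python
def unbounded_knapsack(capacity, items):
    """Best total value for the given capacity when each item may be reused.

    items is a list of (weight, value) pairs with positive integer weights.
    """
    best = [0] * (capacity + 1)  # best[c] = best value achievable with capacity c
    for c in range(1, capacity + 1):
        for weight, value in items:
            if weight <= c:
                # Either keep the current best, or add one copy of this item
                # on top of the best solution for the remaining capacity.
                best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


if __name__ == "__main__":
    print(unbounded_knapsack(7, [(3, 4), (4, 5), (2, 3)]))
```

The running time is O(capacity × number of items), since every table entry considers each item once.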
Nice problem examples.
MIT’s Introduction to Algorithms, Lecture 3: Divide and Conquer - good coders code, great reuse
This is the second post in an article series about MIT's lecture course Introduction to Algorithms. I changed my mind ...
A Sudoku Solver in Java implementing Knuth’s Dancing Links Algorithm
Dr. Donald Knuth’s Dancing Links Algorithm solves an Exact Cover situation. The Exact Cover problem can be extended to a variety of applications that need to fill constraints. Sudoku is one such special case of the Exact Cover problem.
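To make the Exact Cover idea concrete, here is a small Python sketch of Knuth's Algorithm X, the search procedure that Dancing Links speeds up. It uses dictionaries of sets instead of the linked-list "dancing links" structure, so it illustrates the search only, not the linked Java solver; the function names and the sample instance are assumptions for illustration.

```python
def exact_cover(universe, subsets):
    """Return every selection of subset names covering each element exactly once.

    subsets maps a name to a list of the elements that subset contains.
    """
    columns = {element: set() for element in universe}
    for name, elements in subsets.items():
        for element in elements:
            columns[element].add(name)
    return list(_search(columns, subsets, []))


def _search(columns, rows, partial):
    if not columns:  # every element is covered: a solution
        yield list(partial)
        return
    # Branch on the element with the fewest candidate subsets.
    col = min(columns, key=lambda c: len(columns[c]))
    for row in list(columns[col]):
        partial.append(row)
        removed = _select(columns, rows, row)
        yield from _search(columns, rows, partial)
        _deselect(columns, rows, row, removed)
        partial.pop()


def _select(columns, rows, row):
    """Remove the elements covered by row and every subset that conflicts."""
    removed = []
    for j in rows[row]:
        for other in columns[j]:
            for k in rows[other]:
                if k != j:
                    columns[k].remove(other)
        removed.append(columns.pop(j))
    return removed


def _deselect(columns, rows, row, removed):
    """Undo _select in reverse order (backtracking)."""
    for j in reversed(rows[row]):
        columns[j] = removed.pop()
        for other in columns[j]:
            for k in rows[other]:
                if k != j:
                    columns[k].add(other)


if __name__ == "__main__":
    subsets = {"A": [1, 4, 7], "B": [1, 4], "C": [4, 5, 7],
               "D": [3, 5, 6], "E": [2, 3, 6, 7], "F": [2, 7]}
    print(exact_cover(range(1, 8), subsets))
```

For Sudoku, each subset would represent placing one digit in one cell, and the elements would be the cell, row, column, and box constraints that placement satisfies.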
See also the references, esp. Knuth's original paper.
Artisan System - A PHP5 Object Oriented Framework
how phrases work in search indexes
baseplane - technology platforms » Big O Notation in Design Theory » baseplane - technology platforms
A collection of some useful code
C code snippets for little tricks
the bit twiddler
Automated Day Trader: Double Moving Average Crossover, Test 1
The Prediction API makes it possible to, for example, make recommendations based on historical data: http://bit.ly/c7z06p
Google Prediction API
MIT’s Introduction to Algorithms, Lecture 15: Dynamic Programming - good coders code, great reuse
This is the tenth post in an article series about MIT's lecture course Introduction to Algorithms. In this post I ...
Constraint programming in Python — The Uswaretech Blog - Django Web Development
Calculation by hand
Method Art
Andrei Alexandrescu
nice open source data framework for real time video effects
a video showing some realtime techniques for computer vision
CS-TR-339 Computer Go Tech Report
An Introduction to the Computer Go Field and Associated Internet Resources
How Spellcheckers Work | PC Plus
As you can see, the process of checking spellings and suggesting corrections is not an exact science, but there's no denying that it has made our lives a little easier and our publications a little less unpredictable.
http://news.ycombinator.com/item?id=745537
A Non-Mathematical Introduction to Using Neural Networks | Heaton Research
Interesting book on programming / development.
COS 493, Spring 2002: Schedule and Readings
Algorithms for Massive Data Sets
First replicating creature spawned in life simulator - physics-math - 16 June 2010 - New Scientist
IF YOU found a self-replicating organism living inside your computer, your first instinct might be to reach for the antivirus software. If, however, you are Andrew Wade, an avid player in the two-dimensional, mathematical universe known as the Game of Life, such a discovery is nothing short of an epiphany.
First replicating creature spawned in life simulator
Posted by Supybot
Less Wrong: Bayes' Theorem Illustrated (My Way)
Great illustration.
Accurately computing running variance
The most direct way of computing sample variance or standard deviation can have severe numerical problems. [...] There is a way to compute variance that is more accurate and is guaranteed to always give positive results. Furthermore, the method computes a running variance. That is, the method computes the variance as the x's arrive one at a time. The data do not need to be saved for a second pass.
"This better way of computing variance goes back to a 1962 paper by B. P. Welford and is presented in Donald Knuth's Art of Computer Programming, Vol 2, page 232, 3rd edition. Although this solution has been known for decades, not enough people know about it. Most people are probably unaware that computing sample variance can be difficult until the first time they compute a standard deviation and get an exception for taking the square root of a negative number. It is not obvious that the method is correct even in exact arithmetic. It's even less obvious that the method has superior numerical properties, but it does."
A simple way to compute running sample variance (standard deviation).
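The running-variance method attributed above to Welford can be sketched in a few lines of Python. The class and method names are assumptions for illustration; the update rule keeps only a count, a running mean, and a running sum of squared deviations, so no data needs to be stored for a second pass.

```python
import math


class RunningStats:
    """Accumulates mean and sample variance one value at a time (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the current mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the old mean (via delta) and the new mean; this product is
        # never negative, so the variance cannot come out below zero.
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Sample variance (divides by n - 1); 0.0 with fewer than two values."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

    def std_dev(self):
        return math.sqrt(self.variance())


if __name__ == "__main__":
    stats = RunningStats()
    for value in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.push(value)
    print(stats.mean, stats.variance(), stats.std_dev())
```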
Computing mean, variance and standard deviation on a stream of data.
The Most Important Algorithms (Survey)
Quick, what is the fastest way to search a sorted array? Binary search, right? Wrong. There is actually a method called interpolation search, in which, rather than pessimistically looking in the middle of the array, you use a model of the key distribution to predict the location of the key and look there.
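A minimal Python sketch of interpolation search, assuming a sorted list of integers whose values are spread roughly uniformly (the simplest possible "model of the key distribution"). The function name and the sample data are assumptions for illustration.

```python
def interpolation_search(values, key):
    """Return an index of key in the sorted integer list values, or -1 if absent."""
    lo, hi = 0, len(values) - 1
    while lo <= hi and values[lo] <= key <= values[hi]:
        if values[lo] == values[hi]:
            # All remaining values are equal, so key must equal them here.
            return lo
        # Guess the position by linear interpolation between the endpoints.
        pos = lo + (key - values[lo]) * (hi - lo) // (values[hi] - values[lo])
        if values[pos] == key:
            return pos
        if values[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


if __name__ == "__main__":
    data = list(range(0, 100, 5))
    print(interpolation_search(data, 35), interpolation_search(data, 36))
```

On uniformly distributed keys the expected number of probes grows like O(log log n), but on badly skewed data it can degrade to linear, which is why binary search remains the safer default.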
Interpolation search algorithm
Understanding and Applying Operational Transformation - Code Commit
@djspiewak wrote a very detailed intro to operational transformation. Very useful for building, say, a collab editor
Almost exactly a year ago, Google made one of the most remarkable press releases in the Web 2.0 era. Of course, by “press release”, I actually mean keynote at their own conference, and by “remarkable” I mean potentially-transformative and groundbreaking. I am referring of course to the announcement of Google Wave, a real-time collaboration tool which has been in open beta for the last several months.
Good article explaining how the Operational Transform from Google Wave can be implemented, and the various cases that have to be handled when server and client both have edits pending.
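The core idea of operational transformation can be illustrated with the smallest possible case: two concurrent insertions into the same string. This Python sketch is a toy under stated assumptions, not the algorithm used by Google Wave or the article's implementation; the Insert type, the site field used for tie-breaking, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text is inserted
    text: str
    site: int  # hypothetical site id, used only to break ties consistently


def apply_op(doc, op):
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, other):
    """Rewrite op so that it can be applied after other has been applied."""
    if op.pos < other.pos or (op.pos == other.pos and op.site < other.site):
        return op  # other landed to the right of op, so op is unaffected
    # other inserted text at or before op's position: shift op to the right.
    return Insert(op.pos + len(other.text), op.text, op.site)


if __name__ == "__main__":
    doc = "abc"
    a = Insert(1, "X", site=1)  # edit pending on one side
    b = Insert(1, "Y", site=2)  # concurrent edit pending on the other side
    via_a = apply_op(apply_op(doc, a), transform(b, a))
    via_b = apply_op(apply_op(doc, b), transform(a, b))
    print(via_a, via_b)  # both orders should yield the same document
```

A full system also needs deletions, composition of operations, and a protocol for acknowledging edits between client and server, which is where most of the cases described in the article arise.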
The algorithm behind "Wave"
MIT OpenCourseWare | Electrical Engineering and Computer Science | 6.042J Mathematics for Computer Science, Spring 2005 | Lecture Notes
Mathematics for Computer Science
Plain english explanation of Big O - Stack Overflow
One of the best layperson's explanations of algorithm complexity that I've seen.
Traditional computers can solve problems in polynomial time. Certain things are used in the world because of this. Public Key Cryptography is a prime example. It is computationally hard to find two prime factors of a very large number. If it wasn't, we couldn't use the public key systems we use.
Stack Overflow post about Big O notation
Plain english explanation of Big O - Stack Overflow
I recently programmed the AI for the World Series of Poker, developed by Left Field Productions and published by Activision. I started out thinking it would be an easy task. But it proved a lot more complex than I initially thought. This article for the budding poker AI programmer provides a foundation for a simple implementation of No-Limit Texas Holdem Poker AI, covering the basics of hand strength evaluation and betting. By following the recipe set out here, you will quickly become able to implement a reasonably strong poker AI, and have a solid foundation on which to build. I assume you are familiar with the basic terminology of poker.
Cowboy Programming » Programming Poker AI
I recently programmed the AI for the World Series of Poker, developed by Left Field Productions and published by Activision. I started out thinking it would be an easy task. But it proved a lot more complex than I initially thought.
A 10-MINUTE DESCRIPTION OF HOW JUDY ARRAYS WORK AND WHY THEY ARE SO FAST
As the inventor of the Judy algorithm I've been asked repeatedly, "What makes Judy so fast?" The answer is not simple, but finally I can share all of the details.
A complex (to implement) but efficient scalable data-structure that obtains very high performance by minimising the number of cache-line fills required.
A Coder's Musings: Curve fitting with Pyevolve
Genetic algorithms with Python
Genetic algorithm library usage