
Algebraix: A new approach to querying huge databases

By Tom Foremski | April 8, 2010, 3:31pm PDT

Summary: Algebraix says it has a new approach to querying very large databases. If it can do what it says, it is one very hot startup…

I met with Charles Silver, the CEO of Algebraix Data, a startup based in San Diego. He was telling me about his company’s approach to querying large databases, very quickly, and with no need for prior indexing, or construction of data models.

It all sounded too good to be true, I told him.

If Algebraix can do what it says it can, it is a very hot startup.

Here are some notes from our conversation:

- The founders are maths professors and they’ve created an ‘algebra’ for querying databases.

- It works on any size database; we haven’t yet found a limit.

- It works on any data in any database, including unstructured data.

- It goes beyond relational databases, which are based on rows and columns, essentially giant spreadsheets.

- Usually, querying large databases can be a multi-million dollar project: creating the right data sets, then testing them. No prior data modeling is required with our approach.

- We’ve just begun to talk to potential enterprise customers in the past six weeks.

- The CIA and Navy were our first customers and we now have pilots running at some large financial services companies.

- It can be used to power other applications. It works with standard SQL and runs in near-real time.

- You could use it to mirror any application.

- It is based on extended set theory.

- We’ve raised a total of $12m — all from angels.

- We have 22 employees.

- It’s expensive. Our list price is based on the size of the data and is $100,000 for the first Terabyte.

- It runs on any Linux-based hardware system. Our development center is in Austin, Texas, and Dell has given us a bunch of machines to use.

- The size of the application is very small: just 7 MBytes.

- I used to run RealAge, which collected lifestyle data on millions of people and recommended pharmaceuticals. I sold it to Hearst in September 2007.

- I was an investor, then joined as Chairman, and took over as CEO in the fall.

- We have four patents on the technology. Getting software patents is not as easy as it once used to be.

- People don’t believe it. Even our own developers don’t believe that it does what it does.

If the Algebraix approach works across different databases, it could become a perfect systems integration technology.

- - -
Please see:

Our technology | Algebraix Data, formerly Xsprada

Is The Relational Database Finally Finished? | The Virtual Circle

Tom Foremski reports on the business and culture of Silicon Valley at the intersection of technology and media.

Disclosure

Tom Foremski

Tom Foremski is the editor and publisher of Silicon Valley Watcher and Silicon Valley Watch. Tibco Software is an advertiser.

Biography

Tom Foremski

In May 2004, Tom Foremski became the first journalist to leave a major newspaper, the Financial Times, to make a living as a full-time journalist blogger. He writes the popular news blog Silicon Valley Watcher--reporting on the business of Silicon Valley.

Tom arrived in San Francisco in 1984, and has covered US technology markets for leading computer journals around the world.

12 Comments

Something missing
Zathros 8th Apr 2010
And response times are...?
- - -
that's good
Linux Geek 9th Apr 2010
as long as it is not some proprietary crap like M$ Linq I'm fine with it.
- - -
Also missing
TxM2xTx Updated - 9th Apr 2010
What exactly do the CIA and Navy use it for ?
- - -
secret stuff
cwallen19803@... 9th Apr 2010
It's a mystery to you what the Navy or CIA do?
- - -
No substance whatsoever in this article. Is ZDNet now in the business of putting lipstick on pigs to lure investors?

I hope the SEC reads this.
Let's see: unstructured data (let's call them pages), no data models (but the pages have identified fields), b'zillions of records (pages). Hmm, this all sounds familiar. Does the system build indices, by any chance? On a document (Google) or word (Saffron) basis?
- - -
but it would be done with a completely new paradigm for data structures.

It would have some similarities to SQL, but any part of the data could be directly or indirectly accessed.

However, to develop it, I'll be needing between 10 and 100 million dollars and the fastest computers that can be built.

Anybody got that kind of money lying around?

I'm kidding about the money, but I really have been brainstorming the idea.
Rows and columns are an implementation detail. A relation is a
set of n-ary tuples.

Performing a query against a database is finding a subset of
the data through union and intersection operations. The best
performance is on indexed data (when one leaves the
abstraction world of sets and gets into the efficiency world of
algorithms).
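
As a rough illustration of the set view described above, here is a minimal Python sketch. The relation, its fields, and the variable names are made up for this example and do not come from Algebraix or from any particular database; it only shows a query expressed as subsets combined by intersection and union.

```python
# A relation modelled as a set of n-ary tuples (here: name, city, age).
# The data and field layout are invented for illustration only.
people = {
    ("alice", "Austin", 34),
    ("bob", "San Diego", 41),
    ("carol", "Austin", 29),
    ("dave", "San Diego", 52),
    ("erin", "Dallas", 25),
}

# Each predicate selects a subset of the relation (a full scan, no index).
in_austin = {row for row in people if row[1] == "Austin"}
over_30 = {row for row in people if row[2] > 30}

# "AND" between predicates is set intersection, "OR" is set union.
austin_and_over_30 = in_austin & over_30
austin_or_over_30 = in_austin | over_30

print(sorted(austin_and_over_30))
print(sorted(austin_or_over_30))
```

Each subset here is found by scanning every tuple, which is the part an index would speed up.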

One could argue that perhaps the smart thing is to make index
creation an on first demand activity, with the index optimized
for the query, and subsequent similar queries reusing the
index, minus data that has left the database plus the indexing
on data that is new since the last time the query ran.

This could be called mapReduce and it's what Google does.
Well, I don't know about the caching, because the dynamism of
the information on the web is about as maximized as any
human activity may be. Google also runs their queries in
parallel.
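
The index-on-first-demand idea above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of how Google, MapReduce, or Algebraix work: the class name `LazyIndexedTable` and its methods are hypothetical, and for simplicity the sketch updates an existing index on every insert and delete instead of patching it when the next query arrives.

```python
from collections import defaultdict


class LazyIndexedTable:
    """Toy table that builds a per-column index the first time it is queried."""

    def __init__(self):
        self.rows = {}      # row_id -> tuple
        self.indexes = {}   # column position -> {value -> set of row_ids}
        self._next_id = 0

    def insert(self, row):
        row_id = self._next_id
        self._next_id += 1
        self.rows[row_id] = row
        # Add the new row to any index that has already been built.
        for col, index in self.indexes.items():
            index[row[col]].add(row_id)
        return row_id

    def delete(self, row_id):
        row = self.rows.pop(row_id)
        # Remove the departed row from any index that has already been built.
        for col, index in self.indexes.items():
            index[row[col]].discard(row_id)

    def lookup(self, col, value):
        if col not in self.indexes:
            # First query on this column: build its index with one full scan.
            index = defaultdict(set)
            for row_id, row in self.rows.items():
                index[row[col]].add(row_id)
            self.indexes[col] = index
        ids = self.indexes[col].get(value, set())
        return [self.rows[i] for i in sorted(ids)]


table = LazyIndexedTable()
first = table.insert(("alice", "Austin"))
table.insert(("bob", "San Diego"))
table.lookup(1, "Austin")            # first query: builds the index on column 1
table.insert(("carol", "Austin"))    # new data since the last query
table.delete(first)                  # data that has left the table
result = table.lookup(1, "Austin")   # reuses the index
print(result)  # [('carol', 'Austin')]
```

Only the queried column gets an index; column 0 is never indexed because nothing asked for it.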

Math professors using extended sets? Even our engineers don't
believe it. We've convinced the military to pay us for it.

Okay.

Operating as I am in the kilo- magnitude, and armed with a
fundamental grounding in SQL, a fully licensed deployment of
postgresql, and a util library of some Haskell functions which
express the essence of mapReduce in about 40 lines of code
(for the databases that are not really indexable due to
dynamism or field heterogeneity), I'm intrigued but I can wait
for full enlightenment. I'd say it'd be the day they trim the 5th
zero off of that $100,000 price tag.
- - -
heh, math profs, eh?
CobraA1 11th Apr 2010
"The founders are maths professors and they've created
an 'algebra' for querying databases."

I've seen my share of math profs that theorize about
computer algorithms.

Most of them have no idea how long operations actually
take inside a computer, or the tradeoffs involved.

"It is based on extended set theory."

On really large databases, set theory can blow up if
you're not aware of the size/speed tradeoffs involved.
Some operations, like the Cartesian product, can
really blow up the size of your data in a large
database.
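
To put rough numbers on the Cartesian product point, here is a small Python sketch. The two relations and their sizes are made up for illustration; the only claim is the arithmetic, that the product has as many rows as the two input row counts multiplied together.

```python
from itertools import product

# Two small, made-up relations of one-field tuples.
customers = [("c%d" % i,) for i in range(1000)]
orders = [("o%d" % i,) for i in range(1000)]

# The Cartesian product pairs every row of one relation with every row
# of the other, so its row count is the product of the input row counts.
product_rows = len(customers) * len(orders)
print(product_rows)  # 1000000

# Materialising a 10 x 10 slice confirms the arithmetic on a small scale.
small = list(product(customers[:10], orders[:10]))
print(len(small))  # 100
```

Two inputs of a thousand rows each already give a million pairs, and the count keeps multiplying with every further relation in the product.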

It sounds like all this product really does is to take
the side of size in the size/speed tradeoff. Which
means this could kill the memory or drive space of a
machine if your database works with a lot of data.

"People don't believe it. Even our own developers
don't believe that it does what it does."

I take this as a possible sign of snake oil, not as a
sign that this is really a fantastic technology.

"getting software patents is not as easy as it once
used to be."

That's good news, because frankly patents are way too
easy to get, and in the past hurt the software
industry more than they helped.
- - -
I agree, this article is sparse on details and makes it sound like smoke and mirrors. To evaluate the technology look at the patents.

They are assigned to Xsprada Corporation (former name of Algebraix) and there are six listed: patent numbers 2007/0266000, 2007/0276784, 2007/0276785, 2007/0276786, 2007/0276787, and 2007/0276802. On the surface they are all very much alike, an example is at http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220070266000%22.PGNR.&OS=DN/20070266000&RS=DN/20070266000 .

I can't say I'm an expert, but glancing over the patents it appears they are creating some kind of generalized hash called a "Global Unique Identifier (GUID)" that can represent data as well as the algebraic relations between the data. There's all the detail you could want there; whether or not it all makes sense is beyond my abilities to judge.
- - -
Algebraix and extended set theory
KenNorth Updated - 16th Apr 2010
Algebraix Data has built a product on the foundation of David Childs' extended set theory, which dates back to research first published in the 1960s. Dr. E.F. Codd cited one of Childs' papers in his 1970 paper about the relational model for data. Codd's paper was the milestone behind creation of today's SQL database industry. There's more detail and links to Childs' research here:
http://www.drdobbs.com/blog/archives/2010/03/data_models_acc.html

Childs' most recent paper (March 2010) discusses the concept of a mathematical identity for data and the advantage of a set-store architecture. It is covered in "Information Density, Mathematical Identity, Set Stores and Big Data":

http://www.drdobbs.com/blog/archives/2010/03/information_den.html

There have been independent benchmarks of extended set processing software versus SQL products for processing analytical (TPC-H) queries. The performance differences have been dramatic - Algebraix Data is barking up the right tree.
