Many products, both open source and proprietary, can handle Big Data. Which
one is the best fit for this task?
Today's classic RDBMSs and tools are able to quickly load the data, process
it and present results in an easy-to-understand format. You can use SQL
or a programmatic interface to process the data randomly or in batch; RDBMSs
keep data safe, protected against hardware and software failures.
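As a minimal sketch of the two access patterns mentioned above, the following uses Python's built-in sqlite3 module as a stand-in for a classic RDBMS. The `calls` table, its columns and the sample rows are hypothetical and serve only to illustrate a random (single-row) lookup next to a batch (set-based) aggregation.

```python
import sqlite3


def random_and_batch_demo():
    """Show a single-row lookup and a batch aggregation over the same table."""
    # In-memory database: a stand-in for a real RDBMS (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE calls (id INTEGER PRIMARY KEY, customer TEXT, minutes INTEGER)"
    )

    # Hypothetical sample data, loaded in one transaction.
    rows = [(1, "alice", 12), (2, "bob", 7), (3, "alice", 30), (4, "carol", 5)]
    with conn:
        conn.executemany("INSERT INTO calls VALUES (?, ?, ?)", rows)

    # Random access: fetch one row by its primary key.
    one = conn.execute(
        "SELECT customer, minutes FROM calls WHERE id = ?", (3,)
    ).fetchone()

    # Batch processing: aggregate over the whole table in one statement.
    totals = conn.execute(
        "SELECT customer, SUM(minutes) FROM calls GROUP BY customer ORDER BY customer"
    ).fetchall()

    conn.close()
    return one, totals


if __name__ == "__main__":
    single_row, per_customer = random_and_batch_demo()
    print(single_row)
    print(per_customer)
```

The same SQL text would work, with minor dialect changes, against most relational databases; only the connection call is specific to sqlite3.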
Standard tools and products are not able to cope with Big Data requirements,
which are not dissimilar to what is involved in processing today's regular
data sets, just on a much bigger scale. Mainstream companies such as telcos,
financial institutions and web companies, as well as governments, are reaching
the limit of what can be efficiently processed by classic RDBMS technologies.
When it comes to picking a proper platform and tools to handle your Big Data
there are a cou...
Classic, proprietary relational database management systems are slowly
drifting towards where elephants end up. They are all based on a decades-old
code base (System R) that was designed for the then-prevalent single-node,
disk-based computer architectures. Classic, proprietary RDBMSs are going
nowhere in both senses: they are here to stay, much as mainframes and COBOL
are still sticking around, and new, modern companies and startups pass them
by, usually starting with the LAMP stack (open source) and eventually
progressing to NoSQL databases to address MySQL scalability issues. In...
Amazon recently added Oracle database hosting capabilities to its RDS service
offering. You can now rent Oracle database infrastructure in a
pay-as-you-go fashion. We are going to explore whether corporations should be
using Amazon AWS Oracle Database related services (EC2, RDS), how they
should be used, and where the possible savings and potential trouble points
are. With services like Amazon AWS it doesn't matter where your hardware and
software physically are: they could be in a room next to you or in some other
country. It is much easier and cheaper to procure and get new serv...
Vertica is a high-performing, advanced RDBMS that is very simple to install and
administer, thanks to its modern design and purpose-built architecture.
Once we have executed all preparatory steps on the database servers and
downloaded the Vertica software as per the Installation Guide, we start the
installation process on a two-node cluster (host01, host02):
/opt/vertica/sbin/install_vertica -s host01,host02 \
    -r vertica-ce-5.1.1-0.x86_64.RHEL5.rpm
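When the host list changes between environments, it can help to assemble this command programmatically. The following is a minimal sketch, not part of Vertica's tooling: the helper name `build_install_command` is hypothetical, and it relies only on the `-s` (comma-separated hosts) and `-r` (RPM file) options shown in the command above. It builds and prints the command without running it.

```python
import shlex

# Installer path taken from the command shown above.
INSTALLER = "/opt/vertica/sbin/install_vertica"


def build_install_command(hosts, rpm_path, installer=INSTALLER):
    """Return the installer invocation as an argument list (hypothetical helper)."""
    if not hosts:
        raise ValueError("at least one host is required")
    # -s takes a comma-separated host list, -r the RPM file, as in the text.
    return [installer, "-s", ",".join(hosts), "-r", rpm_path]


if __name__ == "__main__":
    cmd = build_install_command(
        ["host01", "host02"], "vertica-ce-5.1.1-0.x86_64.RHEL5.rpm"
    )
    # Print a shell-safe, single-line form of the command for review.
    print(shlex.join(cmd))
```

Keeping the command as an argument list avoids shell quoting problems if it is later passed to `subprocess.run`.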
We initiate the database creation process using the dbadmin tool:
$ /opt/vertica/bin/adminTools
We will pick option 6 (Configuration Menu), then option 1 to...
Hadoop is designed to store extremely large volumes of data. HBase, an open
source NoSQL data store, makes it possible to randomly access such large data
sets. HBase is included in Cloudera's Hadoop distribution.
One of the major obstacles to a wider adoption of NoSQL databases is the lack
of query languages, i.e., the lack of comprehensive non-programmatic interfaces
to data inside a NoSQL data store. We expect NoSQL databases to come up with
such query languages in the near future. In the meantime, Quest's Toad for Cloud
fills this gap and makes it easy to seamlessly access NoSQL, Cloud and ...