Showing entries 111 to 120 of 1253
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: Databases (reset)
Never use floats for money

UPDATE: Several people have commented that decimal(10,2) is not correct for money, since some currencies go out to more than 2 decimal places. Others claimed that storing cents (or another base unit) as integers makes it simpler to perform calculations (thanks, Kevin Farley, for your comment). Regardless of what you choose – don’t use floats for money. If you do use integers, I would include the base unit in the name to avoid confusion (AmountInCents).

Data types make all the difference in the world when you’re designing your database. The choices you make now will affect the quality of your data, as well as application performance. I’m going to focus on one issue in this article: why you should always use decimals to represent money. Let’s jump in and see why that’s true.
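As a quick illustration of the claim, here is a minimal Python sketch rather than anything taken from the article's own ledger. It assumes ten hypothetical entries of 0.10 each and totals them three ways: as binary floats (the same kind of representation a FLOAT or DOUBLE column uses), as decimals, and as integer cents. The names `entries`, `float_total`, `decimal_total`, and `cents_total` are made up for this example.

```python
from decimal import Decimal

# Hypothetical ledger amounts (an assumption for illustration only).
entries = ["0.10"] * 10

# Binary floating point: 0.10 has no exact binary representation,
# so the error accumulates as the amounts are added one by one.
float_total = 0.0
for amount in entries:
    float_total += float(amount)

# Decimal arithmetic keeps the amounts exact.
decimal_total = Decimal("0")
for amount in entries:
    decimal_total += Decimal(amount)

# Integer cents (the "AmountInCents" approach) is exact as well.
cents_total = 0
for amount in entries:
    cents_total += int(Decimal(amount) * 100)

print(float_total == 1.0)                # False
print(decimal_total == Decimal("1.00"))  # True
print(cents_total == 100)                # True
```

The float total misses 1.0 by a tiny amount, which is exactly the kind of discrepancy that makes a ledger fail to balance.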

An example of floats gone wrong

Let’s use a really, really simplified accounting ledger. It’s just three fields, an entry id, …

[Read more]
WebScaleSQL RPMs for CentOS 6

Looks like this post was rather unclear. See the bottom for how to build the rpms quickly.

WebScaleSQL was announced last week. This looks like a good thing for MySQL as it provides a buildable version of MySQL which includes multiple patches from Facebook, Google, LinkedIn, and Twitter needed by large users of MySQL, patches which have not been incorporated into the upstream source tree.  Making this more visible will possibly encourage more of these patches to be brought into the code sooner.

The source is provided as a git repo at https://github.com/webscalesql/webscalesql-5.6 and, as detailed at http://webscalesql.org/faq.html, there is currently no intention to provide binaries. …

[Read more]
Continuent Replication to Hadoop – Now in Stereo!

Hopefully by now you have already seen that we are working on Hadoop replication. I’m happy to say that it is going really well. I’ve managed to push a few terabytes of data and different data sets through into Hadoop on Cloudera, HortonWorks, and Amazon’s Elastic MapReduce (EMR). For those who have been following my long association with the IBM InfoSphere BigInsights Hadoop product, I’m pleased to say that it’s working there too. I’ve had to adapt Robert’s original script to work with the different versions of the underlying Hadoop tools and systems to make it compatible. The actual performance and process are unchanged; you just use a different JS-based batchloader script to work with different tools.

Robert has also been simplifying some of the core functionality, such as configuring some fixed pre-determined formats, so you no longer have to explicitly set the field and record separators.

I’ve also been …

[Read more]
MySQL 5.6 GTIDs: Evaluation and Online Migration

A colleague and I have been looking at GTIDs in MySQL recently, and you may be interested in the blog post that resulted from that. You can see it here: http://blog.booking.com/mysql-5.6-gtids-evaluation-and-online-migration.html

Interviewing for a Database Developer

I work for a firm that’s heavily invested in SQL – a team that needs to have developers who know their way around relational databases and MySQL in particular. I want to show you how I run interviews for our development positions.

Method

Everybody has their own methods and opinions on how to conduct technical interviews. I’ve found that I generally dislike interviews that focus either on whiteboard puzzles or obscure technical details, since they don’t really show how good the candidate is at what really matters: building functioning, quality apps. I prefer to run the interview as if we’re talking about the design for a new product. I want to figure out the requirements, mull over the data model, and write some simple queries to make sure we can show the data we need to.

This process should show two things: the candidate has a good enough grip on the MySQL database that they can comfortably build a system …

[Read more]
Real-Time Data Loading from MySQL to Hadoop using Tungsten Replicator 3.0 Webinar

To follow up and describe some of the methods and techniques behind replicating into Hadoop from MySQL in real-time, and how this can be combined into your data workflow, Continuent are running a webinar, presented by me, that will go over the details and provide a demo of the data replication process.

Real-Time Data Loading from MySQL to Hadoop with New Tungsten Replicator 3.0

Hadoop is an increasingly popular means of analyzing transaction data from MySQL. Up until now mechanisms for moving data between MySQL and Hadoop have been rather limited. The new Continuent Tungsten Replicator 3.0 provides enterprise-quality replication from MySQL to Hadoop. Tungsten Replicator 3.0 is 100% open source, released under a GPL V2 license, and available for download at https://code.google.com/p/tungsten-replicator/. Continuent Tungsten handles MySQL transaction …

[Read more]
Parallel Extractor for Provisioning

Coming up as a new feature in Tungsten Replicator (and written by our replicator expert Stephane Giron) is the ability to provision a new database by using data from an existing database. This new feature comes in the form of a tool called the Parallel Extractor.

The principles are very simple. On the master side:

  • Start the master replicator offline.
  • Switch the replicator to the online provision state.
  • The master replicator pulls the data out of the existing database and writes that information into the Transaction History Log (THL). At this point, the normal replicator thread is not extracting events from the source database.
  • Once the parallel replication has completed, the replicator switches over to normal extraction mode, and starts writing change data into the THL.
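The master-side sequence above can be sketched as a small state machine. This is a toy model in Python, not Tungsten Replicator's actual API: the class `ToyMasterReplicator`, the `State` names, and the list standing in for the THL are all hypothetical and exist only to show the order of the steps.

```python
from enum import Enum


class State(Enum):
    OFFLINE = "offline"
    ONLINE_PROVISION = "online-provision"
    ONLINE = "online"


class ToyMasterReplicator:
    """Toy model of the master-side steps; not the real replicator."""

    def __init__(self, existing_rows, pending_changes):
        self.existing_rows = list(existing_rows)
        self.pending_changes = list(pending_changes)
        self.thl = []                 # stand-in for the Transaction History Log
        self.state = State.OFFLINE    # step 1: start offline

    def provision(self):
        # Step 2: switch to the provisioning state.
        self.state = State.ONLINE_PROVISION
        # Step 3: copy existing data into the THL; no change events yet.
        for row in self.existing_rows:
            self.thl.append(("provision", row))
        # Step 4: provisioning is done, switch to normal extraction.
        self.state = State.ONLINE

    def extract_changes(self):
        # Change data is only written once provisioning has finished.
        if self.state is not State.ONLINE:
            raise RuntimeError("normal extraction only runs once online")
        for change in self.pending_changes:
            self.thl.append(("change", change))
        self.pending_changes.clear()


replicator = ToyMasterReplicator(["row1", "row2"], ["update1"])
replicator.provision()
replicator.extract_changes()
print(replicator.thl)
```

The point of the model is the ordering: every provisioned row lands in the log before any change event does.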

On the slave side, the THL events are read as usual from the master and applied to the slave, but …

[Read more]
Tiny happy features in MySQL

I love it when software gives you elegant ways of solving your problem. Programming language designers make me feel like they care when they take the time to include succinct, powerful expressions. I’ve recently discovered some new things in MySQL, as well as a few rediscoveries. These are the first five, and I’ll cover the next five in another article.

In

You’ve probably used the standard In operator before:

Select 'Oh yeah!' From dual Where 1 In (1,2,3);

As a side note, the dual table is just a dummy table that always returns one row. It’s useful for demonstrating language features or running experiments.

You can also use a subquery with In:

Select 1 From dual Where 1 In (Select 1);

The thing I discovered was that it’s not just scalar values: it’s actually comparing rows, so you can see if a row is present:

Select 1 From dual Where (1,2) In (Select 1,2); …
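
If you don't have a MySQL server handy, the three queries above can be tried with a minimal Python sketch using the built-in sqlite3 module. Two assumptions apply: SQLite has no dual table, so the sketch creates a one-row stand-in with that name, and row comparisons with In need SQLite 3.15 or newer. The helper `rows` and the extra non-matching query are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite has no built-in dual table, so create a one-row stand-in.
conn.execute("CREATE TABLE dual (dummy TEXT)")
conn.execute("INSERT INTO dual VALUES ('X')")


def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Scalar value against a list of values.
scalar_match = rows("Select 'Oh yeah!' From dual Where 1 In (1,2,3)")

# Scalar value against a subquery.
subquery_match = rows("Select 1 From dual Where 1 In (Select 1)")

# Whole row against a subquery that returns rows of the same width.
row_match = rows("Select 1 From dual Where (1,2) In (Select 1,2)")

# A row that is not present returns nothing (added for contrast).
row_miss = rows("Select 1 From dual Where (1,3) In (Select 1,2)")

print(scalar_match, subquery_match, row_match, row_miss)
```

The last query is the useful one to remember: the row (1,3) is not in the subquery's result, so no row comes back.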
[Read more]
MC at Percona Live San Francisco 2014

Now I’m back in the MySQL fold, I’ve got the opportunity to speak at Percona Live again. I’ve always enjoyed speaking at this conference (back when it was known by another name…), although I need to up my game and do the 6 talks I did back in 2009.

On the Tuesday afternoon, tutorials day, I’m running a half-day session with my replication colleague Linas Virbalas. This will be similar to the session I did at Percona Live London, and cover some of the more advanced content on replication, including, but not limited to:

  • Filters
  • JavaScript Filtering
  • Some fun and practical filters
  • Heterogeneous replication from MySQL out to MongoDB, Vertica, Oracle and Hadoop

I might even choose to demo …

[Read more]
Inner vs. Outer Joins

I want to teach you the difference between an inner and an outer join. We first need to think about what a join is. Simply, it’s when you combine two tables to make a new one. You’re not physically creating a new table when you join them together, but for the purposes of the query, you are creating a new virtual table. Every row now has the columns from both tables. So if TableA has columns Col1 and Col2 and TableB has columns Col3 and Col4, when you join these two tables, you’ll get Col1, Col2, Col3, and Col4. Just as with any query, you have the option of including all columns or excluding some, as well as filtering out rows.
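
Here is a minimal sketch of that virtual table, using Python's built-in sqlite3 module rather than MySQL. The table and column names come from the paragraph above; the sample rows and the join condition Col1 = Col3 are assumptions made for this example. A left outer join is included only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Col1 INTEGER, Col2 TEXT);
    CREATE TABLE TableB (Col3 INTEGER, Col4 TEXT);
    -- Row 1 of TableA matches two rows in TableB; row 2 matches none.
    INSERT INTO TableA VALUES (1, 'a-one'), (2, 'a-two');
    INSERT INTO TableB VALUES (1, 'b-first'), (1, 'b-second');
""")

# Inner join: only rows that match on the join condition are returned.
inner = conn.execute(
    "SELECT * FROM TableA INNER JOIN TableB ON Col1 = Col3 ORDER BY Col4"
)
inner_columns = [description[0] for description in inner.description]
inner_rows = inner.fetchall()

# Left outer join: unmatched TableA rows are kept, padded with NULLs.
outer_rows = conn.execute(
    "SELECT * FROM TableA LEFT OUTER JOIN TableB ON Col1 = Col3 "
    "ORDER BY Col1, Col4"
).fetchall()

print(inner_columns)
print(inner_rows)
print(outer_rows)
```

The joined result carries all four columns, the inner join returns two rows for the one TableA row that has two matches, and the outer join adds the unmatched TableA row with empty TableB columns.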

Inner join. A join is combining the rows from two tables. An inner join attempts to match up the two tables based on the criteria you specify in the query, and only returns the rows that match. If a row from the first table in the join matches two rows in the second table, then two rows will be …

[Read more]