Difference between revisions of "Portable SQL/Tips/Replacing records"

From Linuxnetworks
Jump to: navigation, search
(Replacing records)
 
(performance & conclusion)
Line 1: Line 1:
 
Often you would like to insert a new record or update an existing record if the record already exists.  
 
Often you would like to insert a new record or update an existing record if the record already exists.  
  
== The usual way ==
+
== The traditional way ==
  
 
Only MySQL and Oracle provide convenient statements like REPLACE to do this in one step but they are all vendor specific. So many applications use a SELECT statement to decide if an INSERT or UPDATE statement is necessary:
 
Only MySQL and Oracle provide convenient statements like REPLACE to do this in one step but they are all vendor specific. So many applications use a SELECT statement to decide if an INSERT or UPDATE statement is necessary:
Line 33: Line 33:
 
     exec( "UPDATE ''table'' SET ''name'' = 'test' WHERE ''id'' = 1" );
 
     exec( "UPDATE ''table'' SET ''name'' = 'test' WHERE ''id'' = 1" );
 
  }
 
  }
 +
 +
== Performance ==
 +
 +
There are two corner cases which are interesting from the performance point of view: All records are new and must be inserted or all records are already available in the database table.
 +
 +
=== Case 1: All records are new ===
 +
 +
In the traditional approach with a combination with SELECT and INSERT, the application must execute two queries per record. The SELECT query never returns a result in this case and the INSERT is always successful. Therefore, the complexity per record ''n'' is
 +
 +
O(n) = 2n
 +
 +
The alternative only needs to insert the new record without the need of any further actions. The UPDATE statement is never executed as the INSERT statement is always successfull. Thus, the complexity is only
 +
 +
O(n) = n
 +
 +
which is twice as fast compared to the traditional approach.
 +
 +
=== Case 2: All records are already available ===
 +
 +
The SELECT statement always returns the ID of an already existing record in the traditional approach which can be used as input for the UPDATE statement. Again, we always need two statements per record so the complexity is
 +
 +
O(n) = 2n
 +
 +
The INSERT fails in this case and the UPDATE statement is always executed in the alternative approach, so we also need two statements per records:
 +
 +
O(n) = 2n
 +
 +
Therefore, there is no performance gain in this case between the two approaches.
 +
 +
== Conclusion ==
 +
 +
Always executing the INSERT statement before doing an UPDATE if necessary simplifies the application logic by reducing the number of required statements. Furthermore, it increases the performance if not all records are already available. In this case the performance is the same as in the traditional approach.
 +
 +
 +
----
 +
Back to [[Portable SQL|Overview]]

Revision as of 23:34, 18 August 2007

Often you would like to insert a new record or update an existing record if the record already exists.

The traditional way

Only MySQL and Oracle provide convenient statements like REPLACE to do this in one step but they are all vendor specific. So many applications use a SELECT statement to decide if an INSERT or UPDATE statement is necessary:

if( exec( "SELECT id FROM table WHERE name = 'test'" ) )
{
  exec( "INSERT INTO table (id, name) VALUES (1, 'test')" );
}
else
{
  exec( "UPDATE table SET name='test' WHERE id = 1" );
}

The alternative

Provided your tables use proper unique indices, there is a simpler way handling this in the application. Instead, of doing a lookup first, always execute the INSERT statement and check for errors. If none has occurred, the new record was inserted successfully. Otherwise, there is already a records with the same ID available and an UPDATE is required instead:

if( exec( "INSERT INTO table (id, name) VALUES (1, 'test')" ) )
{
  exec( "UPDATE table SET name = 'test' WHERE id = 1" );
}

This is also very handy if your used database layer (OpenDBX, PDO) support throwing exceptions in case an error occurred:

try
{
   exec( "INSERT INTO table (id, name) VALUES (1, 'test')" );
}
catch( Exception $e )
{
   exec( "UPDATE table SET name = 'test' WHERE id = 1" );
}

Performance

There are two corner cases which are interesting from the performance point of view: All records are new and must be inserted or all records are already available in the database table.

Case 1: All records are new

In the traditional approach with a combination with SELECT and INSERT, the application must execute two queries per record. The SELECT query never returns a result in this case and the INSERT is always successful. Therefore, the complexity per record n is

O(n) = 2n

The alternative only needs to insert the new record without the need of any further actions. The UPDATE statement is never executed as the INSERT statement is always successfull. Thus, the complexity is only

O(n) = n

which is twice as fast compared to the traditional approach.

Case 2: All records are already available

The SELECT statement always returns the ID of an already existing record in the traditional approach which can be used as input for the UPDATE statement. Again, we always need two statements per record so the complexity is

O(n) = 2n

The INSERT fails in this case and the UPDATE statement is always executed in the alternative approach, so we also need two statements per records:

O(n) = 2n

Therefore, there is no performance gain in this case between the two approaches.

Conclusion

Always executing the INSERT statement before doing an UPDATE if necessary simplifies the application logic by reducing the number of required statements. Furthermore, it increases the performance if not all records are already available. In this case the performance is the same as in the traditional approach.



Back to Overview