Dapper caches per-query information (SQL text, command type, parameter type) so it can materialize objects and bind parameters quickly. The cache is stored in a ConcurrentDictionary that is never flushed, so dynamically generated SQL without parameters can cause memory issues. Parameterized queries are preferred: the cache key depends on the SQL and parameter types, so a single entry is reused and SQL Server can also reuse the compiled execution plan. Buffering determines whether all rows are loaded into memory before iterating. QueryMultiple is used for commands that return multiple result sets. Dirty tracking with interfaces allows Dapper to detect whether an update actually changed data and skip unnecessary SQL generation.
3. Query Caching
• Dapper caches information about every query it runs, which
allows it to materialize objects and process parameters
quickly.
• The current implementation caches this information in a
ConcurrentDictionary object.
static readonly ConcurrentDictionary<Identity, CacheInfo> _queryCache =
new ConcurrentDictionary<Identity, CacheInfo>();
• The objects it stores are never flushed.
• If you are generating SQL strings on the fly without using
parameters it is possible you will hit memory issues.
• Each query you issue creates an Identity key, built from the
SQL text, the command type and the parameter type.
• The CacheInfo object contains the dynamically generated
IDataReader deserializer and IDbCommand parameter-binding
delegates, plus a hit counter used to manage the size of the cache.
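Because the Identity key is built from the SQL text and the parameter *type* rather than the parameter values, repeated calls with the same shape of query reuse a single cache entry. A minimal sketch, assuming a `Member` type and an open `connection`:

```csharp
// Both calls produce the same Identity (same SQL, same anonymous
// parameter type, same result type), so the cached deserializer and
// parameter reader are generated once and reused for both.
var sql = "SELECT * FROM Members WHERE Email = @email";
var a = connection.Query<Member>(sql, new { email = "alice@example.com" });
var b = connection.Query<Member>(sql, new { email = "bob@example.com" });
```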
4. Cont…
• The Identity class, which is used as the cache key, looks like
the following:
5. private Identity(string sql, CommandType? commandType, string connectionString,
       Type type, Type parametersType, Type[] otherTypes, int gridIndex)
{
    this.sql = sql;
    this.commandType = commandType;
    this.connectionString = connectionString;
    this.type = type;
    this.parametersType = parametersType;
    this.gridIndex = gridIndex;
    unchecked
    {
        hashCode = 17; // we *know* we are using this in a dictionary, so pre-compute this
        hashCode = hashCode * 23 + commandType.GetHashCode();
        hashCode = hashCode * 23 + gridIndex.GetHashCode();
        hashCode = hashCode * 23 + (sql == null ? 0 : sql.GetHashCode());
        hashCode = hashCode * 23 + (type == null ? 0 : type.GetHashCode());
        if (otherTypes != null)
        {
            foreach (var t in otherTypes)
            {
                hashCode = hashCode * 23 + (t == null ? 0 : t.GetHashCode());
            }
        }
        hashCode = hashCode * 23 + (connectionString == null ? 0 : connectionString.GetHashCode());
        hashCode = hashCode * 23 + (parametersType == null ? 0 : parametersType.GetHashCode());
    }
}
6. CacheInfo Class
class CacheInfo
{
    public Func<IDataReader, object> Deserializer { get; set; }
    public Func<IDataReader, object>[] OtherDeserializers { get; set; }
    public Action<IDbCommand, object> ParamReader { get; set; }
    private int hitCount;
    public int GetHitCount() { return Interlocked.CompareExchange(ref hitCount, 0, 0); }
    public void RecordHit() { Interlocked.Increment(ref hitCount); }
}
7. Note on Caching
• Use this
string s = "SELECT email, passwd, login_id, full_name FROM members WHERE email = @email";
SqlCommand cmd = new SqlCommand(s);
cmd.Parameters.Add("@email", email);
• Instead of
cmd.CommandText = "SELECT email, passwd, login_id, full_name FROM members WHERE email = '" + email + "'";
• The first one is parameterized. It will be cached once. The
second one is not parameterized: every distinct value of email
produces a new cache entry, which will explode your memory. (+)
• The first one is vastly superior. It avoids injection attacks,
Dapper can cache it once, and SQL Server will compile the
execution plan once and cache it. (+)
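The same rule applies when querying through Dapper itself. A hedged sketch, assuming a `Member` type and an open `connection`:

```csharp
// Parameterized: one cache entry, one cached execution plan, no injection risk.
var member = connection.QueryFirstOrDefault<Member>(
    "SELECT email, passwd, login_id, full_name FROM members WHERE email = @email",
    new { email });

// Avoid: concatenating the value into the SQL creates a new cache entry
// for every distinct email, and opens the door to SQL injection.
// var member = connection.QueryFirstOrDefault<Member>(
//     "SELECT email, passwd, login_id, full_name FROM members WHERE email = '" + email + "'");
```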
8. Buffering
• The buffer is unrelated to cache.
• Dapper does not include any kind of data-cache (although it
does have a cache related to the way how it processes
commands, i.e. "this command string, with this type of
parameter, and this type of entity - has these associated
dynamically generated methods to configure the command
and populate the objects").
• Buffering is a bool value, supplied against each command. By
default, buffering is set to true.
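The flag is passed per call via the optional `buffered` parameter on `Query`. A sketch, assuming an `Order` type and a `connectionString` for illustration:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // buffered: true (the default) - every row is read into a list
    // before the call returns; the reader is closed immediately.
    var all = connection.Query<Order>(
        "SELECT * FROM Orders", buffered: true).ToList();

    // buffered: false - rows are streamed one at a time as you iterate;
    // the reader (and connection) stays busy until iteration finishes.
    foreach (var order in connection.Query<Order>(
        "SELECT * FROM Orders", buffered: false))
    {
        // Only the current row is materialized in memory here.
    }
}
```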
9. Buffer = true
• In a buffered API all the rows are read before anything is
yielded.
• Best suited when dealing with a limited number of rows (for
example, 100 or 200), so that it consumes little memory. (+)
• Once you get the data, the command is complete - so there is
no conflict between that and subsequent operations. (+)
• It does not hold the active connection for a long time. (+)
• As soon as you get the data, the command has already
released any resources (locks etc), so you're having minimal
impact on the server. (+)
• If the query is immense, loading them all into memory (in a
list) could be expensive / impossible. (-)
• Higher latency: nothing is available until the entire result set
has been read. (-)
10. Buffer = false
• Use this when dealing with a large amount of data (thousands
to millions of rows), where buffering would consume a lot of
memory storing the rows.
• You can iterate over immense queries (many millions of rows)
without needing them all in memory at once, since you're only
ever really looking at the current row being yielded. (+)
• In a streaming API each element is yielded individually. This is
very memory-efficient, but if you do lots of subsequent
processing per item, it means your connection / command
could be "active" for an extended time. (+)
• You don't need to wait for the end of the data to start iterating;
you can begin as soon as it has at least one row. (+)
11. Cont…
• The connection is in use while you're iterating, which can lead
to "There is already an open DataReader associated with this
Connection which must be closed first" errors if you try to
invoke other commands on a per-row basis (this can be
mitigated by MARS). (-)
• It holds the active connection for a long time when dealing
with a large amount of data. (-)
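If you do need per-row commands while streaming, SQL Server's Multiple Active Result Sets (MARS) can be enabled in the connection string. A configuration sketch; the server and database names are placeholders:

```csharp
// MultipleActiveResultSets=True lets more than one reader be open on the
// same connection at once, avoiding the "already an open DataReader" error.
var connectionString =
    "Server=myServer;Database=myDb;Integrated Security=true;" +
    "MultipleActiveResultSets=True";
```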
12. Query Vs QueryMultiple
• We need to choose whether to use Query or QueryMultiple.
• The choice is based entirely on the number of result sets
expected from the command. If more than one result set is
expected, we must use QueryMultiple; otherwise, we use Query.
• QueryMultiple carries some additional logic, which adds a small
amount of complexity.
13. QueryMultiple Example
var sql = @"
select * from Customers where CustomerId = @id;
select * from Orders where CustomerId = @id;
select * from Returns where CustomerId = @id";

using (var multi = connection.QueryMultiple(sql, new { id = value }))
{
    var customer = multi.Read<Customer>().Single();
    var orders = multi.Read<Order>().ToList();
    var returns = multi.Read<Return>().ToList();
}
14. Dirty Tracking
• Dapper (via the Dapper.Contrib extensions) provides a nice
feature to determine whether an UPDATE statement is really
required. If no value has changed, it won't generate the SQL
statement, which is a very handy performance optimization.
• The only requirement is that we declare an interface for the
object and query through it.
15. Update – Without Tracking
using (var sqlConnection = new SqlConnection(Constant.DatabaseConnection))
{
    sqlConnection.Open();
    // Fetching the concrete class returns a plain object with no tracking,
    // so Update always generates and issues the SQL statement.
    var entity = sqlConnection.Get<Supplier>(9);
    entity.ContactName = "John Smith";
    sqlConnection.Update(entity);
    var result = sqlConnection.Get<Supplier>(9);
}
16. Update – With Tracking
public interface ISupplier
{
    int Id { get; set; }
    string CompanyName { get; set; }
    string ContactName { get; set; }
    string ContactTitle { get; set; }
}

public class Supplier : ISupplier
{
    public int Id { get; set; }
    public string CompanyName { get; set; }
    public string ContactName { get; set; }
    public string ContactTitle { get; set; }
}
17. Cont…
using (var sqlConnection = new SqlConnection(conString))
{
    sqlConnection.Open();
    // Querying via the interface returns a change-tracking proxy.
    var supplier = sqlConnection.Get<ISupplier>(9);
    // Nothing has changed yet, so no UPDATE is issued: IsUpdated False.
    Console.WriteLine(string.Format("IsUpdated {0}", sqlConnection.Update(supplier)));
    supplier.CompanyName = "NewManning";
    // The proxy is now dirty, so the UPDATE runs: IsUpdated True.
    Console.WriteLine(string.Format("IsUpdated {0}", sqlConnection.Update(supplier)));
}