Posts tagged ‘So Eric Remembers’

Stupid Linq Tricks: Record Rollup

I’ve been in these situations too many times where I’ve had to receive a logical document and the “text” has been split across multiple items in a list. Most of the time it occurs with databases but I’ve had to do the same thing with B2B integrations with Xml too.

Before Linq I’d end up with the prototypical foreach code but this last time I need to (re)solve such a problem Linq had since been born.

Here is an example:

//Source object where depending on the length of the text
public class RawRecords
{
  public int Id { get; set; }
  public int Sequence { get; set; }
  public DateTime Timestamp { get; set; }
  public string Comment { get; set; }
}
//Destination object where you want the multiple comment lines to be rolled up into the destination CommentText property.
public class Comment
{
  public int CommentId { get; set; }
  public DateTime CommentTimestamp { get; set; }
  public string CommentText { get; set; }
}
[TestMethod]
public void single_comment_rollup()
{
    IEnumerable<RawRecords> rawRecords = GetSampleRow();
    const string EXPECTED = "Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.";

    var actualComment = new Comment
    {
        CommentId = rawRecords.First().Id,
        CommentTimestamp = rawRecords.First().Timestamp,
        CommentText = string.Join("", rawRecords.OrderBy(ord => ord.Sequence).Select(x => x.Comment))
    };

    Assert.AreEqual(EXPECTED, actualComment.CommentText);
}

Big deal. What if you have a set of data that has multiple comments

[TestMethod]
public void multiple_comment_rollup()
{
    IEnumerable<RawRecords> rawRecords = GetSampleRows();
    const string EXPECTED_COMMENT_42 = "Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.";
    const string EXPECTED_COMMENT_69 = "Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante.";

    IEnumerable<Comment> actualComments =
        from ncr in rawRecords
        group ncr by new { ncr.Id, ncr.Timestamp }
        into g
        select new Comment
        {
            CommentId = g.Key.Id,
            CommentTimestamp = g.Key.Timestamp,
            CommentText = string.Join("", g.OrderBy(x => x.Sequence).Select(x => x.Comment))
        };

    //If you have multiple asserts in your normal tests, well you know what will happen
    Assert.AreEqual(EXPECTED_COMMENT_42, actualComments.Single(x => x.CommentId == 42).CommentText);
    Assert.AreEqual(EXPECTED_COMMENT_69, actualComments.Single(x => x.CommentId == 69).CommentText);
}

Here are the static helper methods I used to kruft up some sample data if you would like to play around with these queries

private static IEnumerable<RawRecords> GetSampleRow()
{
    var timestamp1 = new DateTime(2010, 3, 12);
    return new List<RawRecords>
               {
                   new RawRecords { Sequence = 4, Id = 42,  Timestamp = timestamp1, Comment = "et malesuada fames " },
                   new RawRecords { Sequence = 1, Id = 42,  Timestamp = timestamp1, Comment = "Pellentesque habitant " },
                   new RawRecords { Sequence = 3, Id = 42,  Timestamp = timestamp1, Comment = "senectus et netus " },
                   new RawRecords { Sequence = 5, Id = 42,  Timestamp = timestamp1, Comment = "ac turpis egestas." },
                   new RawRecords { Sequence = 2, Id = 42,  Timestamp = timestamp1, Comment = "morbi tristique " },
               };
}

private static IEnumerable<RawRecords> GetSampleRows()
{
    var timestamp1 = new DateTime(2010, 3, 12);
    var timestamp2 = new DateTime(2010, 3, 13);
    return new List<RawRecords>
               {
                   new RawRecords { Sequence = 4, Id = 42,  Timestamp = timestamp1, Comment = "et malesuada fames " },
                   new RawRecords { Sequence = 3, Id = 42,  Timestamp = timestamp1, Comment = "senectus et netus " },
                   new RawRecords { Sequence = 4, Id = 69,  Timestamp = timestamp2, Comment = "ante." },
                   new RawRecords { Sequence = 2, Id = 42,  Timestamp = timestamp1, Comment = "morbi tristique " },
                   new RawRecords { Sequence = 1, Id = 42,  Timestamp = timestamp1, Comment = "Pellentesque habitant " },
                   new RawRecords { Sequence = 1, Id = 69,  Timestamp = timestamp2, Comment = "Vestibulum tortor quam, " },
                   new RawRecords { Sequence = 2, Id = 69,  Timestamp = timestamp2, Comment = "feugiat vitae, ultricies " },
                   new RawRecords { Sequence = 3, Id = 69,  Timestamp = timestamp2, Comment = "eget, tempor sit amet, " },
                   new RawRecords { Sequence = 5, Id = 42,  Timestamp = timestamp1, Comment = "ac turpis egestas." },
               };
}

I do realized that ORM and other database queries have the ability to do these aggregates for you. I was merely trying to point out some Linq tricks for those situations where you are dealing with objects and might not be in a data access situation.

Checking for special characters using LINQ

wussy

Why on earth am I posting this?  For the simple reason of hoping if I take the time to put it down on “paper” that I will never have a need to use such a thing again.  Call it superstition.

Here is some specific background of why this came about.  A scenario came up where I was migrating some test data from one platform to another.  Some of the source data was encrypted (even in test system) and for my destination system I didn’t care about that data anyway.  If I came across any of that encrypted information and would just put blanks in the destination.  So in short if a string has special characters, nuke the data and write an empty string.

For what ever reason I wasn’t in a Regex mood and looked at LINQ for a solution.  Plus it is migration code so who really cares how optimal it is.

The first two lines of this extension method are something I read about and thought, “neat and when would I ever use it”.  Basically Enumerable.Range will give you a range of of numbers with a starting number and a count.

IEnumerable lowRange = Enumerable.Range(0, 32);

The next albeit messy part takes the input string and converts it to a character array then takes each character and converts it to is ascii code representation.

IEnumerable enumerable = value.ToCharArray().Select(x => Convert.ToInt32(((int)x).ToString()));

The last bit was even more obscure than the range bit. Enumerable.Intersect “Produces the set intersection of two sequences.”  If you have two lists, {1,2,3} and {3,4,5} the resultant list will be 3 with a .Count() of 1.

public static class StringExtensions
{
  public static bool HasInvalidCharacters(this string value)
  {
    //Ascii range between 32 and 127
    IEnumerable<int> lowRange = Enumerable.Range(0, 32);
    IEnumerable<int> highRange = Enumerable.Range(128, 128);
    IEnumerable<int> enumerable = value.ToCharArray().Select(x => Convert.ToInt32(((int)x).ToString()));
    return lowRange.Intersect(enumerable).Count() != 0 || highRange.Intersect(enumerable).Count() != 0;
  }
}

This was may more of a post than I hoped but hopefully someone will find it useful and I’ll will never have to think about it again!