.NET Technical bits: Details on LINQ Query Expressions

Wednesday, April 7, 2010

A query expression is a query expressed in query syntax. A query expression is a first-class language construct. It is just like any other expression and can be used in any context in which a C# expression is valid. A query expression consists of a set of clauses written in a declarative syntax similar to SQL or XQuery. Each clause in turn contains one or more C# expressions, and these expressions may themselves be either a query expression or contain a query expression.

A query expression must begin with a from clause and must end with a select or group clause. Between the first from clause and the last select or group clause, it can contain one or more of these optional clauses: where, orderby, join, let and even additional from clauses. You can also use the into keyword to enable the result of a join or group clause to serve as the source for additional query clauses in the same query expression.

• The Query Variable:

In LINQ, a query variable is any variable that stores a query instead of the results of a query. More specifically, a query variable is always an enumerable type that will produce a sequence of elements when it is iterated over in a foreach statement or a direct call to its IEnumerator.MoveNext method.
The following code example shows a simple query expression with one data source, one filtering clause, one ordering clause, and no transformation of the source elements. The select clause ends the query.

static void Main()
{
    // Data source.
    int[] scores = { 90, 71, 82, 93, 75, 82 };

    // Query Expression.
    IEnumerable<int> scoreQuery = // query variable
        from score in scores // required
        where score > 80 // optional
        orderby score descending // optional
        select score; // must end with select or group

    // Execute the query to produce the results
    foreach (int testScore in scoreQuery)
    {
        Console.WriteLine(testScore);
    }
}
// Outputs: 93 90 82 82


In the previous example, scoreQuery is a query variable, which is sometimes referred to as just a query. The query variable stores no actual result data; the results are produced only when the foreach loop executes. Even then, the query results are not returned through the query variable scoreQuery. Rather, they are returned through the iteration variable testScore. The scoreQuery variable can be iterated in a second foreach loop, and it will produce the same results as long as neither it nor the data source has been modified.
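
To make the point about deferred results concrete, here is a minimal sketch. It assumes a List<int> source instead of the array used above, and the class and method names (DeferredExecutionSketch, Run) are made up for illustration. The same query variable is iterated twice, with one element added to the data source in between.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
static class DeferredExecutionSketch
{
    // Iterates the same query twice and returns both result lists.
    public static List<int>[] Run()
    {
        List<int> scores = new List<int> { 90, 71, 82, 93, 75, 82 };

        // Defining the query does not read the data source yet.
        IEnumerable<int> scoreQuery =
            from score in scores
            where score > 80
            orderby score descending
            select score;

        // First iteration: ToList enumerates the query against the current contents.
        List<int> first = scoreQuery.ToList();

        // Modify the data source, then iterate the same query variable again.
        scores.Add(99);
        List<int> second = scoreQuery.ToList();

        return new[] { first, second };
    }

    static void Main()
    {
        foreach (List<int> result in Run())
        {
            Console.WriteLine(string.Join(" ", result));
        }
        // Prints:
        // 93 90 82 82
        // 99 93 90 82 82
    }
}
```

The second list contains 99 even though scoreQuery itself was never reassigned, which is what one would expect if the variable holds the query rather than its results.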

A query variable may store a query that is expressed in query syntax or method syntax, or a combination of the two. In the following examples, both queryMajorCities and queryMajorCities2 are query variables:


// Query syntax
IEnumerable<City> queryMajorCities =
    from city in cities
    where city.Population > 100000
    select city;

// Method-based syntax
IEnumerable<City> queryMajorCities2 = cities.Where(c => c.Population > 100000);
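
As a rough, self-contained version of the comparison above, the sketch below runs the same filter in query syntax, in method syntax, and as a combination of the two. The City class and the sample data are assumptions, since the post does not show how City or cities are defined.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; the post does not show how City is defined.
class City
{
    public string Name { get; set; }
    public int Population { get; set; }
}

static class QueryVariableSketch
{
    // Made-up sample data for illustration.
    static readonly City[] cities =
    {
        new City { Name = "Smallville", Population = 45000 },
        new City { Name = "Midtown", Population = 250000 },
        new City { Name = "Bigburg", Population = 1200000 }
    };

    // Query syntax.
    public static IEnumerable<string> QuerySyntax()
    {
        return from city in cities
               where city.Population > 100000
               select city.Name;
    }

    // Method-based syntax producing the same sequence.
    public static IEnumerable<string> MethodSyntax()
    {
        return cities.Where(c => c.Population > 100000)
                     .Select(c => c.Name);
    }

    // Combination: a parenthesized query expression followed by a method call.
    public static IEnumerable<string> Mixed()
    {
        return (from city in cities
                where city.Population > 100000
                select city.Name).Reverse();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", QuerySyntax()));  // Midtown, Bigburg
        Console.WriteLine(string.Join(", ", MethodSyntax())); // Midtown, Bigburg
        Console.WriteLine(string.Join(", ", Mixed()));        // Bigburg, Midtown
    }
}
```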


• Starting a Query Expression:

A query expression must begin with a from clause. It specifies a data source together with a range variable. The range variable represents each successive element in the source sequence as the source sequence is being traversed. The range variable is strongly typed based on the type of elements in the data source. For example, if countries is an array of Country objects, a range variable introduced with from country in countries is typed as Country. Because the range variable is strongly typed, you can use the dot operator to access any available members of the type.
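
A small sketch of range variable typing follows. The Country class, its Name and Area members, and the sample data are assumptions made for illustration. The second method goes slightly beyond the text above: it shows that the range variable can be typed explicitly when the source is a non-generic collection such as ArrayList.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; Name and Area are assumed members.
class Country
{
    public string Name { get; set; }
    public double Area { get; set; } // sq km
}

static class RangeVariableSketch
{
    // Made-up sample data.
    static readonly Country[] countries =
    {
        new Country { Name = "Northland", Area = 750000 },
        new Country { Name = "Islandia", Area = 12000 }
    };

    public static IEnumerable<string> LargeCountries()
    {
        // 'country' is inferred as Country because countries is a Country[],
        // so country.Area and country.Name are checked at compile time.
        return from country in countries
               where country.Area > 500000
               select country.Name;
    }

    public static IEnumerable<string> LargeCountriesFromArrayList()
    {
        // A non-generic source carries no element type, so the range
        // variable is typed explicitly and each element is cast to Country.
        ArrayList untyped = new ArrayList(countries);
        return from Country country in untyped
               where country.Area > 500000
               select country.Name;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", LargeCountries()));              // Northland
        Console.WriteLine(string.Join(", ", LargeCountriesFromArrayList())); // Northland
    }
}
```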

A query expression may contain multiple from clauses. Use additional from clauses when each element in the source sequence is itself a collection or contains a collection. For example, assume that you have a collection of Country objects, each of which contains a collection of City objects named Cities. To query the City objects in each Country, use two from clauses as shown here:

IEnumerable<City> cityQuery =
    from country in countries
    from city in country.Cities
    where city.Population > 10000
    select city;
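
For readers who want to run the two-from query, here is a sketch with made-up Country and City classes and sample data (none of these definitions appear in the post). It also shows a method-syntax form using SelectMany that yields the same cities; that form is a convenient equivalent for this case, not necessarily the exact translation the compiler emits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration.
class City
{
    public string Name { get; set; }
    public int Population { get; set; }
}

class Country
{
    public string Name { get; set; }
    public List<City> Cities { get; set; }
}

static class MultipleFromSketch
{
    // Made-up sample data.
    static readonly List<Country> countries = new List<Country>
    {
        new Country
        {
            Name = "Northland",
            Cities = new List<City>
            {
                new City { Name = "Alpha", Population = 5000 },
                new City { Name = "Beta", Population = 80000 }
            }
        },
        new Country
        {
            Name = "Southland",
            Cities = new List<City>
            {
                new City { Name = "Gamma", Population = 25000 }
            }
        }
    };

    // The second from clause walks the Cities collection of each country.
    public static IEnumerable<string> WithTwoFromClauses()
    {
        return from country in countries
               from city in country.Cities
               where city.Population > 10000
               select city.Name;
    }

    // Method syntax: SelectMany flattens the nested collections.
    public static IEnumerable<string> WithSelectMany()
    {
        return countries.SelectMany(country => country.Cities)
                        .Where(city => city.Population > 10000)
                        .Select(city => city.Name);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", WithTwoFromClauses())); // Beta, Gamma
        Console.WriteLine(string.Join(", ", WithSelectMany()));     // Beta, Gamma
    }
}
```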


• Ending a Query Expression:

A query expression must end with either a select clause or a group clause.
select: In a query expression, the select clause specifies the type of values that will be produced when the query is executed. The result is based on the evaluation of all the previous clauses and on any expressions in the select clause itself.

class SelectSample1
{
    static void Main()
    {
        // Create the data source
        List<int> Scores = new List<int>() { 97, 92, 81, 60 };

        // Create the query.
        IEnumerable<int> queryHighScores =
            from score in Scores
            where score > 80
            select score;

        // Execute the query.
        foreach (int i in queryHighScores)
        {
            Console.Write(i + " ");
        }
    }
}
// Output: 97 92 81


group: The group clause returns a sequence of IGrouping<TKey, TElement> objects that contain zero or more items that match the key value for the group. For example, you can group a sequence of strings according to the first letter in each string. In this case, the first letter is the key, has type char, and is stored in the Key property of each IGrouping<TKey, TElement> object. The compiler infers the type of the key.


// Query variable is an IEnumerable<IGrouping<char, Student>>
var studentQuery1 =
    from student in students
    group student by student.Last[0];
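
Because a group query produces a sequence of groups, reading the results usually takes a nested loop. The sketch below assumes a Student class with First and Last string properties and uses made-up sample data; neither is defined in the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; First and Last are assumed to be strings.
class Student
{
    public string First { get; set; }
    public string Last { get; set; }
}

static class GroupSketch
{
    // Made-up sample data.
    static readonly List<Student> students = new List<Student>
    {
        new Student { First = "Ann", Last = "Adams" },
        new Student { First = "Bob", Last = "Baker" },
        new Student { First = "Cy", Last = "Allen" }
    };

    public static List<string> DescribeGroups()
    {
        IEnumerable<IGrouping<char, Student>> studentQuery =
            from student in students
            group student by student.Last[0];

        List<string> lines = new List<string>();

        // Outer loop walks the groups; inner loop walks each group's elements.
        foreach (IGrouping<char, Student> studentGroup in studentQuery)
        {
            List<string> names = new List<string>();
            foreach (Student student in studentGroup)
            {
                names.Add(student.Last);
            }
            lines.Add(studentGroup.Key + ": " + string.Join(", ", names));
        }
        return lines;
    }

    static void Main()
    {
        foreach (string line in DescribeGroups())
        {
            Console.WriteLine(line);
        }
        // Prints:
        // A: Adams, Allen
        // B: Baker
    }
}
```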


• Filtering, Ordering, and Joining:

Between the starting from clause, and the ending select or group clause, all other clauses (where, join, orderby, from, let) are optional. Any of the optional clauses may be used zero times or multiple times in a query body.

where: The where clause is used in a query expression to specify which elements from the data source will be returned in the query expression. It applies a Boolean condition (predicate) to each source element (referenced by the range variable) and returns those for which the specified condition is true. A single query expression may contain multiple where clauses, and a single clause may contain multiple predicate subexpressions.
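
A minimal sketch of both forms mentioned above: two separate where clauses, and a single where clause that combines two predicate subexpressions with &&. The data and the names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WhereSketch
{
    // Made-up sample data.
    static readonly int[] numbers = { 4, 11, 12, 18, 21, 30 };

    // Two where clauses applied one after the other.
    public static IEnumerable<int> TwoWhereClauses()
    {
        return from n in numbers
               where n % 2 == 0
               where n > 10
               select n;
    }

    // One where clause with two predicate subexpressions.
    public static IEnumerable<int> OneCombinedPredicate()
    {
        return from n in numbers
               where n % 2 == 0 && n > 10
               select n;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", TwoWhereClauses()));      // 12 18 30
        Console.WriteLine(string.Join(" ", OneCombinedPredicate())); // 12 18 30
    }
}
```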

orderby: In a query expression, the orderby clause causes the returned sequence or subsequence (group) to be sorted in either ascending or descending order. Multiple keys can be specified in order to perform one or more secondary sort operations. The sorting is performed by the default comparer for the type of the sort key. The default sort order is ascending. You can also specify a custom comparer, but only by using method-based syntax.

class OrderbySample1
{
    static void Main()
    {
        // Create a delicious data source.
        string[] fruits = { "cherry", "apple", "blueberry" };

        // Query for ascending sort.
        IEnumerable<string> sortAscendingQuery =
            from fruit in fruits
            orderby fruit // "ascending" is default
            select fruit;

        // Query for descending sort.
        IEnumerable<string> sortDescendingQuery =
            from w in fruits
            orderby w descending
            select w;

        // Execute the query.
        Console.WriteLine("Ascending:");
        foreach (string s in sortAscendingQuery)
        {
            Console.WriteLine(s);
        }

        // Execute the query.
        Console.WriteLine(Environment.NewLine + "Descending:");
        foreach (string s in sortDescendingQuery)
        {
            Console.WriteLine(s);
        }

        // Keep the console window open in debug mode.
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}
/* Output:
Ascending:
apple
blueberry
cherry

Descending:
cherry
blueberry
apple
*/
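
The sample above uses a single sort key. The following sketch covers the two other points from the orderby description: a secondary sort key, and a custom comparer passed through method-based syntax. The data and names are made up, and StringComparer.Ordinal is just one possible comparer chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OrderBySketch
{
    // Made-up sample data.
    static readonly string[] fruits =
        { "cherry", "apple", "blueberry", "fig", "kiwi", "pear" };

    // Primary key: length, ascending. Secondary key: the word itself, descending.
    public static IEnumerable<string> ByLengthThenNameDescending()
    {
        return from fruit in fruits
               orderby fruit.Length, fruit descending
               select fruit;
    }

    // A custom comparer can only be supplied through the OrderBy method.
    // Ordinal comparison places uppercase letters before lowercase ones.
    public static IEnumerable<string> WithOrdinalComparer(IEnumerable<string> words)
    {
        return words.OrderBy(w => w, StringComparer.Ordinal);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", ByLengthThenNameDescending()));
        // fig, pear, kiwi, apple, cherry, blueberry

        string[] mixedCase = { "cherry", "apple", "Blueberry" };
        Console.WriteLine(string.Join(", ", WithOrdinalComparer(mixedCase)));
        // Blueberry, apple, cherry
    }
}
```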


join: A join clause takes two source sequences as input. The elements in each sequence must either be or contain a property that can be compared to a corresponding property in the other sequence. The join clause compares the specified keys for equality by using the special equals keyword. All joins performed by the join clause are equijoins. The shape of the output of a join clause depends on the specific type of join you are performing.
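
To illustrate how the output shape depends on the kind of join, here is a sketch of an inner join and a group join over the same two sequences. The Category and Product classes, their members, and the sample data are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration.
class Category
{
    public int Id { get; set; }
    public string Name { get; set; }
}

class Product
{
    public string Name { get; set; }
    public int CategoryId { get; set; }
}

static class JoinSketch
{
    // Made-up sample data.
    static readonly List<Category> categories = new List<Category>
    {
        new Category { Id = 1, Name = "Fruit" },
        new Category { Id = 2, Name = "Grain" },
        new Category { Id = 3, Name = "Dairy" }
    };

    static readonly List<Product> products = new List<Product>
    {
        new Product { Name = "Apple", CategoryId = 1 },
        new Product { Name = "Rice", CategoryId = 2 },
        new Product { Name = "Pear", CategoryId = 1 }
    };

    // Inner join: one result per matching (category, product) pair.
    public static IEnumerable<string> InnerJoin()
    {
        return from category in categories
               join product in products on category.Id equals product.CategoryId
               select category.Name + ": " + product.Name;
    }

    // Group join: one result per category, including categories with no products.
    public static IEnumerable<string> GroupJoinCounts()
    {
        return from category in categories
               join product in products on category.Id equals product.CategoryId
                   into productGroup
               select category.Name + ": " + productGroup.Count();
    }

    static void Main()
    {
        Console.WriteLine(string.Join("; ", InnerJoin()));
        // Fruit: Apple; Fruit: Pear; Grain: Rice
        Console.WriteLine(string.Join("; ", GroupJoinCounts()));
        // Fruit: 2; Grain: 1; Dairy: 0
    }
}
```

In this sketch the inner join drops Dairy because it has no matching product, while the group join keeps it with an empty group.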

• Continuations with "into":

The into contextual keyword creates a temporary identifier that stores the results of a group, join or select clause. This identifier can itself be a generator for additional query commands. When used in a group or select clause, the use of the new identifier is sometimes referred to as a continuation.
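
A continuation might look like the following sketch, where the identifier introduced by into (letterGroup, a made-up name) is filtered, ordered and projected by further clauses in the same query. The word list is sample data invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ContinuationSketch
{
    // Made-up sample data.
    static readonly string[] words =
        { "apple", "avocado", "banana", "blueberry", "cherry" };

    public static IEnumerable<string> LettersWithSeveralWords()
    {
        return from word in words
               group word by word[0] into letterGroup // continuation starts here
               where letterGroup.Count() >= 2         // filter the groups
               orderby letterGroup.Key                // order the groups
               select letterGroup.Key + ": " + letterGroup.Count();
    }

    static void Main()
    {
        foreach (string line in LettersWithSeveralWords())
        {
            Console.WriteLine(line);
        }
        // Prints:
        // a: 2
        // b: 2
    }
}
```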

• Subqueries in a Query Expression:

A query clause may itself contain a query expression, which is sometimes referred to as a subquery. Each subquery starts with its own from clause, which does not necessarily point to the same data source as the first from clause. For example, the following query shows a query expression that is used in the select clause to retrieve the results of a grouping operation.

var queryGroupMax =
    from student in students
    group student by student.GradeLevel into studentGroup
    select new
    {
        Level = studentGroup.Key,
        HighestScore =
            (from student2 in studentGroup
             select student2.Scores.Average())
            .Max()
    };
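
The query above depends on a students collection that the post does not define. The sketch below wraps the same query in a runnable program, assuming a Student class with an int GradeLevel and a List<int> Scores, plus made-up sample data. The results are copied into a dictionary only so they can be returned from a method, since anonymous types cannot be named in a return type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; GradeLevel and Scores are assumed members.
class Student
{
    public string Name { get; set; }
    public int GradeLevel { get; set; }
    public List<int> Scores { get; set; }
}

static class SubquerySketch
{
    // Made-up sample data.
    static readonly List<Student> students = new List<Student>
    {
        new Student { Name = "Ann", GradeLevel = 1, Scores = new List<int> { 80, 90 } },
        new Student { Name = "Bob", GradeLevel = 1, Scores = new List<int> { 60, 80 } },
        new Student { Name = "Cy", GradeLevel = 2, Scores = new List<int> { 90, 100 } }
    };

    // Maps each grade level to the highest per-student average in that level.
    public static Dictionary<int, double> HighestAverageByLevel()
    {
        var queryGroupMax =
            from student in students
            group student by student.GradeLevel into studentGroup
            select new
            {
                Level = studentGroup.Key,
                // Subquery: its source is the current group, not students.
                HighestScore =
                    (from student2 in studentGroup
                     select student2.Scores.Average())
                    .Max()
            };

        return queryGroupMax.ToDictionary(x => x.Level, x => x.HighestScore);
    }

    static void Main()
    {
        foreach (KeyValuePair<int, double> pair in HighestAverageByLevel())
        {
            Console.WriteLine("Level " + pair.Key + ": " + pair.Value);
        }
        // Prints:
        // Level 1: 85
        // Level 2: 95
    }
}
```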
