LINQ Expressions
LINQ, Language-Integrated Query, represents a set of technologies rooted in the integration of query capabilities in C# (also Visual Basic and other .NET languages).
LINQ allows queries to be first-class language constructs, similar to classes, events, methods, and so on.
The most visible aspect of this integration is the query expression. Query expressions use a declarative syntax that supports rich querying, including filtering, grouping, and ordering operations, with minimal code. The same basic expression patterns query and transform data whether the source is a SQL database, an XML document or stream, an ADO.NET dataset, or a .NET collection.
The example below demonstrates the entire query operation, which consists of creating a data source, defining the query expression, and executing the query:
using System;
using System.Collections.Generic;
using System.Linq;

class LINQExpr
{
    static void Main()
    {
        // Data source
        int[] points = new int[] { 93, 82, 80, 50 };

        // Query
        IEnumerable<int> pointsQuery =
            from point in points
            where point > 80
            select point;

        // Execute query
        foreach (int i in pointsQuery)
        {
            Console.Write(i + " ");
        }
    }
}
C#/LINQ queries retrieve data from a data source, and the application sees the source as an IEnumerable<T> or IQueryable<T> collection regardless of its underlying type. Every query performs one of three possible tasks:
- Retrieve a subset of the elements to produce a new sequence without changing the individual elements.
- Retrieve a sequence of elements and convert them into a new type of object.
- Retrieve a singleton value that describes the source data.
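The three tasks above can be sketched side by side. This is a minimal illustration rather than a definitive pattern; the QueryTasks class, its method names, and the sample data are hypothetical and do not come from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QueryTasks
{
    // Hypothetical sample data.
    static readonly int[] Points = { 93, 82, 80, 50 };

    // Task 1: retrieve a subset; the elements themselves are unchanged.
    public static List<int> HighPoints() =>
        (from point in Points where point > 80 select point).ToList();

    // Task 2: retrieve elements and convert them to a new type (int to string).
    public static List<string> Labels() =>
        (from point in Points select "Score: " + point).ToList();

    // Task 3: retrieve a single value that describes the source data.
    public static int CountAbove80() =>
        (from point in Points where point > 80 select point).Count();

    static void Main()
    {
        Console.WriteLine(string.Join(", ", HighPoints()));  // 93, 82
        Console.WriteLine(string.Join(", ", Labels()));
        Console.WriteLine(CountAbove80());                   // 2
    }
}
```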
Query expressions are first-class constructs; thus, they can be used in any context in which a C# expression is valid. LINQ expressions conform to a declarative syntax resembling SQL or XQuery. They consist of a set of clauses, and each clause contains one or more C# expressions, which may themselves be query expressions or contain them.
Query expressions must start with a from clause and end with a select or group clause. Between the start and end, they can contain one or more optional clauses: where, orderby, join, let, and additional from clauses. The into keyword lets the result of a join or group clause serve as the source for additional query clauses within the same expression.
In LINQ, query variables store queries, not results. These variables are always of an enumerable type, which produces a sequence of elements when iterated with a foreach statement or through direct calls to its IEnumerator.MoveNext method. Review an example of a query variable (or simply a query):
static void Main()
{
    // Data source
    int[] points = { 110, 105, 81, 88, 120, 94 };

    // Query
    IEnumerable<int> pointQuery = // query variable
        from point in points
        where point > 100
        orderby point descending
        select point;

    // Execute the query
    foreach (int gamePoints in pointQuery)
    {
        Console.WriteLine(gamePoints);
    }
}
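Because a query variable stores the query rather than its results, the query runs again each time it is enumerated. The sketch below illustrates this by changing the source between two executions. The DeferredExecution class and its data are hypothetical, and a List<int> replaces the array only so that the source can grow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredExecution
{
    public static (int before, int after) Run()
    {
        // Hypothetical data source.
        var points = new List<int> { 110, 105, 81 };

        // Defining the query does not execute it.
        IEnumerable<int> pointQuery =
            from point in points
            where point > 100
            select point;

        int before = pointQuery.Count();  // first execution: 110 and 105 match
        points.Add(120);                  // change the source after defining the query
        int after = pointQuery.Count();   // second execution also sees 120

        return (before, after);
    }

    static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine(before + " then " + after);  // 2 then 3
    }
}
```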
The following example shows variables that are initialized with a query but are not query variables, because they store a result rather than the query itself:
int hiPoints =
    (from point in points
     select point)
    .Max();

// Another option splits the expression: pointQuery is a query variable,
// while hiPoints still stores a result
IEnumerable<int> pointQuery =
    from point in points
    select point;
int hiPoints = pointQuery.Max();
When in doubt, use the var keyword for query variables. This instructs the compiler to infer the variable's type.
CLAUSES
The from clause begins a query expression. It specifies the data source of the query and introduces a range variable, which represents each successive element of the source sequence. The range variable is strongly typed based on the type of the elements in the data source, so the dot operator can be used to access its members. Range variables remain in scope until the query ends with a semicolon or a continuation clause. A query expression can contain multiple from clauses. Use additional from clauses when each element in the source sequence is itself a collection or contains one. Review an example of a from clause below:
IEnumerable<Nation> nationAreaQuery =
    from nation in nations
    where nation.Area > 400000 // sq km
    select nation;
Review an example of multiple from clauses:
IEnumerable<City> metroQuery =
    from nation in nations
    from metro in nation.Cities
    where metro.Population > 900000
    select metro;
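The text does not define the Nation and City types, so the sketch below supplies hypothetical versions with placeholder data to show the compound from clause running end to end. Treat the class shapes and values as assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: the text does not define Nation or City.
class City
{
    public string Name { get; set; }
    public int Population { get; set; }
}

class Nation
{
    public string Name { get; set; }
    public List<City> Cities { get; set; }
}

static class CompoundFrom
{
    public static List<string> LargeCityNames()
    {
        // Placeholder data, not real populations.
        var nations = new List<Nation>
        {
            new Nation { Name = "A", Cities = new List<City> {
                new City { Name = "A1", Population = 1200000 },
                new City { Name = "A2", Population = 300000 } } },
            new Nation { Name = "B", Cities = new List<City> {
                new City { Name = "B1", Population = 950000 } } },
        };

        // The second from clause walks the inner collection of each nation.
        IEnumerable<string> metroQuery =
            from nation in nations
            from metro in nation.Cities
            where metro.Population > 900000
            select metro.Name;

        return metroQuery.ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", LargeCityNames()));  // A1, B1
    }
}
```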
Group clauses serve as one method for ending query expressions. The group clause produces a sequence of groups organized by a specified key. Any data type can be used for keys. The group clause example below uses a char value for the key:
var queryNationGroups =
    from nation in nations
    group nation by nation.Name[0];
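A grouped query yields a sequence of groups, so consuming it usually takes a nested loop or a call that enumerates each group. The sketch below groups plain strings by their first character; the GroupByInitial class and the name list are hypothetical stand-ins for the Nation objects used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupByInitial
{
    public static List<string> Lines()
    {
        // Hypothetical data: names only, instead of full Nation objects.
        string[] nationNames = { "Chile", "Canada", "Peru", "China", "Poland" };

        // Each element of the result is an IGrouping<char, string>.
        IEnumerable<IGrouping<char, string>> queryNationGroups =
            from name in nationNames
            group name by name[0];

        var lines = new List<string>();
        foreach (IGrouping<char, string> nationGroup in queryNationGroups)
        {
            // Key holds the char used for grouping; the group itself is enumerable.
            lines.Add(nationGroup.Key + ": " + string.Join(", ", nationGroup));
        }
        return lines;
    }

    static void Main()
    {
        foreach (string line in Lines())
        {
            Console.WriteLine(line);
        }
        // C: Chile, Canada, China
        // P: Peru, Poland
    }
}
```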
The select clause produces all other types of sequences. A basic select clause simply produces a sequence of objects of the same type as the objects in the source. In the example below, the orderby clause sorts the elements, and the select clause produces a sequence of these sorted objects:
IEnumerable<Nation> sortQuery =
    from nation in nations
    orderby nation.Area
    select nation;
Other select clauses transform the source data into sequences of new types, a process known as projection. Review a projection example below:
// Use var in queries producing anonymous types
var NamePOPQuery =
    from nation in nations
    select new { Name = nation.Name, Pop = nation.Population };
Use the into keyword in a select or group clause to create a temporary identifier that stores the query result so far. This continuation lets you perform additional operations on the query after the grouping or select operation. Review an example below:
// percentileQuery is IEnumerable<IGrouping<int, Nation>>
var percentileQuery =
    from nation in nations
    let percentile = (int) nation.Population / 4000000
    group nation by percentile into nationGroup
    where nationGroup.Key >= 20
    orderby nationGroup.Key
    select nationGroup;

// grouping is IGrouping<int, Nation>
foreach (var grouping in percentileQuery)
{
    Console.WriteLine(grouping.Key);
    foreach (var nation in grouping)
        Console.WriteLine(nation.Name + ":" + nation.Population);
}
The where clause filters elements from the source data. Review an example below:
IEnumerable<City> MetroPopQuery =
    from metro in cities
    where metro.Population < 10000000 && metro.Population > 700000
    select metro;
The orderby clause sorts results in ascending or descending order, with ascending as the default when no keyword is given. It also allows for secondary sorting. Review an example below which performs a primary sort on area followed by a secondary sort on population. The descending keyword applies only to population, so area still sorts in ascending order:
IEnumerable<Nation> NationSortquery =
    from nation in nations
    orderby nation.Area, nation.Population descending
    select nation;
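The effect of a secondary sort is easiest to see with a tie on the primary key. The sketch below uses anonymous objects with placeholder values (hypothetical, not real figures) in which two entries share the same area.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SortExample
{
    public static List<string> SortedNames()
    {
        // Placeholder data: A and C tie on Area, so Population decides their order.
        var nations = new[]
        {
            new { Name = "A", Area = 500, Population = 10 },
            new { Name = "B", Area = 300, Population = 40 },
            new { Name = "C", Area = 500, Population = 25 },
        };

        // Area sorts ascending (the default); Population sorts descending within ties.
        var sortQuery =
            from nation in nations
            orderby nation.Area, nation.Population descending
            select nation.Name;

        return sortQuery.ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", SortedNames()));  // B, C, A
    }
}
```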
The join clause associates and/or combines elements from multiple data sources based on equality comparison of keys. LINQ join operations manage object sequences with elements of different types. After joining, a select or group statement must specify the elements stored in the output sequence. Anonymous types can be used in these operations. Review an example of a join statement below:
var makeQuery =
    from make in makes
    join model in models on make equals model.Make
    select new { Make = make, Name = model.Name };
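The join above depends on makes and models collections that the text does not define. The sketch below assumes makes is a string array and models is a list of a hypothetical Model class; all values are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: the text does not define the shape of a model.
class Model
{
    public string Name { get; set; }
    public string Make { get; set; }
}

static class JoinExample
{
    public static List<string> Pairs()
    {
        string[] makes = { "Alpha", "Beta" };
        var models = new List<Model>
        {
            new Model { Name = "A-One", Make = "Alpha" },
            new Model { Name = "B-One", Make = "Beta" },
            new Model { Name = "A-Two", Make = "Alpha" },
            new Model { Name = "C-One", Make = "Gamma" },  // no matching make, so it is dropped
        };

        // Inner join: only pairs whose keys compare equal appear in the output.
        var makeQuery =
            from make in makes
            join model in models on make equals model.Make
            select new { Make = make, Name = model.Name };

        return makeQuery.Select(pair => pair.Make + ": " + pair.Name).ToList();
    }

    static void Main()
    {
        foreach (string pair in Pairs())
        {
            Console.WriteLine(pair);
        }
        // Alpha: A-One
        // Alpha: A-Two
        // Beta: B-One
    }
}
```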
The let clause stores the result of an expression, such as a method call, in a new range variable. Review an example below:
string[] names = { "Qi Liu", "Ami Aidoo", "Priya Chopra", "Yoko Tanaka" };

IEnumerable<string> queryGivenNames =
    from name in names
    let givenName = name.Split(new char[] { ' ' })[0]
    select givenName;

foreach (string a in queryGivenNames)
    Console.Write(a + " ");
SUBQUERY
Query clauses can contain query expressions known as subqueries. These subqueries begin with from clauses, and may or may not point to the same data source. Review an example below:
var queryTeamMax =
    from worker in workers
    group worker by worker.Department into workerGroup
    select new
    {
        Department = workerGroup.Key,
        // Subquery over the current group; Quantity must be a numeric
        // collection for Average() to compile
        MaxOutput =
            (from worker2 in workerGroup
             select worker2.Quantity.Average())
            .Max()
    };
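To run the subquery, the sketch below assumes a hypothetical Worker class whose Quantity property is a list of integers, which is one shape that makes the Average() call valid. The class and its data are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: Quantity is assumed to be a list of per-shift output counts.
class Worker
{
    public string Department { get; set; }
    public List<int> Quantity { get; set; }
}

static class SubqueryExample
{
    public static List<string> MaxAveragePerDepartment()
    {
        var workers = new List<Worker>
        {
            new Worker { Department = "Assembly", Quantity = new List<int> { 10, 20 } },  // average 15
            new Worker { Department = "Assembly", Quantity = new List<int> { 30, 50 } },  // average 40
            new Worker { Department = "Packing",  Quantity = new List<int> { 5, 7 } },    // average 6
        };

        var queryTeamMax =
            from worker in workers
            group worker by worker.Department into workerGroup
            select new
            {
                Department = workerGroup.Key,
                // The subquery computes each worker's average, then Max picks the highest.
                MaxOutput =
                    (from worker2 in workerGroup
                     select worker2.Quantity.Average())
                    .Max()
            };

        return queryTeamMax.Select(t => t.Department + ": " + t.MaxOutput).ToList();
    }

    static void Main()
    {
        foreach (string line in MaxAveragePerDepartment())
        {
            Console.WriteLine(line);
        }
        // Assembly: 40
        // Packing: 6
    }
}
```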
QUERY OPTIONS
There are three ways to write a LINQ query in C#:
- Standard query syntax
- Method syntax
- Combination syntax (mixing standard with method)
The first way is the recommended method. Review an example of its use below:
IEnumerable<int> numberQuery =
    from number in numbers
    where number % 5 == 0
    orderby number
    select number;
Some queries must be expressed as method calls, most commonly those that return a single value such as a maximum, minimum, or average. These methods must be called last in a query because they return a single value and cannot serve as the source for further query operations. Review a query written in method syntax below:
IEnumerable<int> numberQuery = numbers.Where(number => number % 5 == 0).OrderBy(n => n);
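The combination syntax listed earlier wraps a query expression in parentheses and finishes it with a method call. The sketch below applies Count and Max in that final position; the MixedSyntax class and the sample numbers are hypothetical.

```csharp
using System;
using System.Linq;

static class MixedSyntax
{
    // Hypothetical sample data.
    static readonly int[] Numbers = { 5, 12, 20, 7, 35, 10 };

    // The query expression filters; Count() then reduces the result to one value.
    public static int CountMultiplesOfFive() =>
        (from number in Numbers
         where number % 5 == 0
         select number).Count();

    // Max() is also called last, since its result is a single value and not a sequence.
    public static int LargestMultipleOfFive() =>
        (from number in Numbers
         where number % 5 == 0
         select number).Max();

    static void Main()
    {
        Console.WriteLine(CountMultiplesOfFive());   // 4
        Console.WriteLine(LargestMultipleOfFive());  // 35
    }
}
```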