Aggregation en ;MongoDB
Introducción a Agregaciones en MongoDB
Fuente:
https://docs.mongodb.com/manual/aggregation/
Las operaciones de agregación procesan los registros de datos y devuelven los resultados calculados. Las operaciones de agregación agrupan los valores de varios documentos juntos y pueden realizar una variedad de operaciones en los datos agrupados para devolver un solo resultado. MongoDB proporciona tres formas de realizar la agregación: el canal de agregación, la función de reducción de mapas y los métodos de agregación de propósito único.
Aggregation Pipeline
Map-Reduce
Single Purpose Aggregation Operations
SQL vs Aggregation
Fuente MongoDB ejemplos
https://docs.mongodb.com/manual/reference/sql-aggregation-comparison/
Ejemplos:
SQL
Description
SELECT COUNT(*) AS count FROM orders
Count all records from orders
db.orders.aggregate( [
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )SELECT SUM(price) AS total FROM orders
Sum the price field from orders
db.orders.aggregate( [
   {
     $group: {
        _id: null,
        total: { $sum: "$price" }
     }
   }
] )SELECT cust_id, SUM(price) AS total FROM orders GROUP BY cust_id
For each unique cust_id, sum the price field.
db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )SELECT cust_id, SUM(price) AS total FROM orders GROUP BY cust_id ORDER BY total
For each unique cust_id, sum the price field, results sorted by sum.
db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $sort: { total: 1 } }
] )SELECT cust_id, ord_date, SUM(price) AS total FROM orders GROUP BY cust_id, ord_date
For each unique cust_id, ord_date grouping, sum the price field. Excludes the time portion of the date.
db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   }
] )SELECT cust_id, count(*) FROM orders GROUP BY cust_id HAVING count(*) > 1
For cust_id with multiple records, return the cust_id and the corresponding record count.
db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        count: { $sum: 1 }
     }
   },
   { $match: { count: { $gt: 1 } } }
] )SELECT cust_id, ord_date,SUM(price) AS total FROM orders GROUP BY cust_id, ord_date HAVING total > 250
For each unique cust_id, ord_date grouping, sum the price field and return only where the sum is greater than 250. Excludes the time portion of the date.
db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
              month: { $month: "$ord_date" },
              day: { $dayOfMonth: "$ord_date" },
              year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )SELECT cust_id, SUM(price) as total FROM ordersWHERE status = 'A' GROUP BY cust_id
For each unique cust_id with status A, sum the price field.
db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )SELECT cust_id, SUM(price) as total FROM orders WHERE status = 'A' GROUP BY cust_id HAVING total > 250
For each unique cust_id with status A, sum the price field and return only where the sum is greater than 250
db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )SELECT cust_id, SUM(li.qty) as qty FROM orders o, order_lineitem li WHERE li.order_id = o.id GROUP BY cust_id
For each unique cust_id, sum the corresponding line item qty fields associated with the orders.
db.orders.aggregate( [
   { $unwind: "$items" },
   {
     $group: {
        _id: "$cust_id",
        qty: { $sum: "$items.qty" }
     }
   }
] )SELECT COUNT(*) FROM (SELECT cust_id, ord_date FROM orders GROUP BY cust_id, ord_date) as DerivedTable
Count the number of distinct cust_id, ord_date groupings. Excludes the time portion of the date.
db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
              month: { $month: "$ord_date" },
              day: { $dayOfMonth: "$ord_date" },
              year: { $year: "$ord_date"}
           }
        }
     }
   },
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )SQL Example
Fuente:
MongoDB Aggregation Example
https://examples.javacodegeeks.com/software-development/mongodb/mongodb-aggregation-example/
$project
db.employee.aggregate([
 {"$project":{ "_id" : 0, "emp_fname" : 1, "emp_dept" : 1, "emp_band" : 1, "emp_specs" : 1 }}
  ])$match
db.employee.aggregate([ {"$match": { "emp_dept": "Finance" }} ])$limit
db.employee.aggregate([ {"$match": { "emp_dept": "Technology" }}, {"$limit": 1} ])$group
db.employee.aggregate([ {"$group": {"_id": {"emp_dept" : "$emp_dept"}, "No_of_Times": {"$sum": 1}}} ])$sort
db.employee.aggregate([ {"$match": { "emp_dept": "HR" }}, {"$sort" : {"emp_fname": 1}} ])$lookup
db.emp_city.aggregate([ {$lookup: {from: "employee", localField: "emp_country_id", foreignField: "_id", as: "Country" }} ])$out
db.employee.aggregate([ {"$match": { "emp_dept": "HR" }}, {"$sort" : {"emp_fname": 1}}, {"$out" : "sorted_emp"} ])$unwind
db.employee.aggregate([ {"$match": { "emp_fname": "April" }}, {"$unwind": "$emp_specs"} ])Expresiones usadas en la funciòn Aggregation
Expression
Description
$sum
Suma
$avg
Promedio
$min
Retorna el valor mínimo
$max
Retorna el valor máximo.
AGGREGATION BUILDER
Fuente:
http://mongodb.github.io/mongo-java-driver/3.2/builders/aggregation/
- Se pueden mezclar con Document(). 
Match
The$matchpipeline stage passes all documents matching the specified filter to the next stage. Though the filter can be an instance of any class that implementsBson, it’s convenient to combine with use of theFiltersclass. In the example below, it’s assumed that theeqmethod of theFiltersclass has been statically imported.
This example creates a pipeline stage that matches all documents where theauthorfield is equal to"Dave":
match(eq(
"author"
, 
"Dave"
))Project
The$projectpipeline stage passes the projected fields of all documents to the next stage. Though the projection can be an instance of any class that implementsBson, it’s convenient to combine with use of theProjectionsclass. In the example below, it’s assumed that theinclude,excludeId, andfieldsmethods of theProjectionsclass have been statically imported.
This example creates a pipeline stage that excludes the_idfield but includes thetitleandauthorfields:
project(fields(include(
"title"
, 
"author"
), excludeId()))Projecting Computed Fields
The$projectstage can project computed fields as well.
This example simply projects theqtyfield into a new field calledquantity. In other words, it renames the field:
project(computed(
"quantity"
, 
"$qty"
))Sample
The$samplepipeline stage randomly select N documents from its input. This example creates a pipeline stage that randomly selects 5 documents from the collection:
sample(
5
)Sort
The$sortpipeline stage passes all documents to the next stage, sorted by the specified sort criteria. Though the sort criteria can be an instance of any class that implementsBson, it’s convenient to combine with use of theSortsclass. In the example below, it’s assumed that thedescending,ascending, andorderBymethods of theSortsclass have been statically imported.
This example creates a pipeline stage that sorts in descending order according to the value of theagefield and then in ascending order according to the value of thepostsfield:
sort(orderBy(descending(
"age"
), ascending(
"posts"
)))Skip
The$skippipeline stage skips over the specified number of documents that pass into the stage and passes the remaining documents to the next stage.
This example skips the first5documents:
skip(
5
)Limit
The$limitpipeline stage limits the number of documents passed to the next stage.
This example limits the number of documents to10:
limit(
10
)Lookup
Starting in 3.2, MongoDB provides a new$lookuppipeline stage that performs a left outer join with another collection to filter in documents from the joined collection for processing.
This example performs a left outer join on thefromCollectioncollection, joining thelocalfield to thefromfield and outputted in thejoinedOutputfield:
lookup(
"fromCollection"
, 
"local"
, 
"from"
, 
"joinedOutput"
)Group
The$grouppipeline stage groups documents by some specified expression and outputs to the next stage a document for each distinct grouping. A group consists of an_idwhich specifies the expression on which to group, and zero or moreaccumulatorswhich are evaluated for each grouping. To simplify the expression of accumulators, the driver includes anAccumulatorsclass with static factory methods for each of the supported accumulators. In the example below, it’s assumed that thesumandavgmethods of theAccumulatorsclass have been statically imported.
This example groups documents by the value of thecustomerIdfield, and for each group accumulates the sum and average of the values of thequantityfield into thetotalQuantityandaverageQuantityfields, respectively.
group(
"$customerId"
, sum(
"totalQuantity"
, 
"$quantity"
), avg(
"averageQuantity"
, 
"$quantity"
))Unwind
The$unwindpipeline stage deconstructs an array field from the input documents to output a document for each element.
This example outputs, for each document, a document for each element in thesizesarray:
unwind(
"$sizes"
)Available with MongoDB 3.2, this example also includes any documents that have missing ornullvalues for the$sizesfield or where the$sizeslist is empty:
unwind(
"$sizes"
, 
new
 UnwindOptions().preserveNullAndEmptyArrays(
true
))Available with MongoDB 3.2, this example unwinds thesizesarray and also outputs the array index into the$positionfield:
unwind(
"$sizes"
, 
new
 UnwindOptions().includeArrayIndex(
"$position"
))Out
The$outpipeline stage outputs all documents to the specified collection. It must be the last stage in any aggregate pipeline:
This example writes the pipeline to theauthorscollection:
out(
"authors"
)Creating a Pipeline
The above pipeline operators are typically combined into a list and passed to theaggregatemethod of aMongoCollection. For instance:
collection.aggregate(Arrays.asList(match(eq(
"author"
, 
"Dave"
)),
                                   group(
"$customerId"
, sum(
"totalQuantity"
, 
"$quantity"
),
                                                        avg(
"averageQuantity"
, 
"$quantity"
))
                                   out(
"authors"
)));Last updated
Was this helpful?