
MongoDB Guide: The Complete NoSQL Database Tutorial for Developers

14 min read, by DevToolBox
TL;DR

MongoDB stores data as flexible BSON documents. Use insertOne/find/updateOne/deleteOne for CRUD, the aggregation pipeline ($match, $group, $lookup) for analytics, compound indexes following the ESR rule for performance, and embed-vs-reference decisions for schema design. Multi-document transactions provide ACID guarantees since v4.0. Use explain() and the database profiler to optimize queries.

1. The Document Model and BSON Types

MongoDB stores data as documents — flexible, JSON-like structures that map naturally to objects in most programming languages. Unlike relational databases, where you define rigid table schemas upfront, MongoDB documents in the same collection can have different fields. Internally, documents use BSON (Binary JSON), which extends JSON with types like Date, ObjectId, Decimal128, and BinData. Each document has a maximum size of 16 MB.

A collection is analogous to a table, and a database holds multiple collections. The _id field is required and automatically generated as an ObjectId if not provided. ObjectId is a 12-byte value containing a 4-byte timestamp, 5-byte random value, and 3-byte counter — making it sortable by creation time.

// A MongoDB document — JSON-like but with rich types
{
  _id: ObjectId("507f1f77bcf86cd799439011"),  // 12-byte unique ID
  name: "Alice Johnson",                       // String (UTF-8)
  age: 29,                                     // Int32
  balance: NumberDecimal("1249.99"),            // Decimal128
  tags: ["developer", "speaker"],               // Array
  address: {                                    // Embedded document
    city: "San Francisco",
    state: "CA",
    zip: "94102"
  },
  createdAt: ISODate("2026-01-15T09:00:00Z"),  // UTC datetime
  metadata: BinData(0, "c3VyZQ==")             // Binary data
}
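Because the first four bytes of an ObjectId encode the creation time, you can recover a document's creation timestamp without storing a separate field. A minimal sketch in plain JavaScript; drivers expose the same value via ObjectId's getTimestamp() method:

```javascript
// Decode the 4-byte big-endian timestamp at the front of an ObjectId.
function objectIdTimestamp(hexId) {
  const seconds = parseInt(hexId.slice(0, 8), 16); // first 4 bytes = seconds since epoch
  return new Date(seconds * 1000);
}

objectIdTimestamp("507f1f77bcf86cd799439011"); // a Date in October 2012
```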
| BSON Type | Example | Use Case |
|---|---|---|
| ObjectId | ObjectId("507f...") | Default _id, embeds timestamp |
| String | "hello" | UTF-8 text |
| Int32 / Int64 | 42 / NumberLong(42) | Integer values |
| Decimal128 | NumberDecimal("9.99") | Financial / exact decimal |
| Date | ISODate("2026-01-15") | Timestamps |
| Array | [1, 2, 3] | Lists, multikey indexes |
| Boolean | true / false | Flags |

2. CRUD Operations

Every data interaction maps to Create, Read, Update, or Delete. All operations accept a filter document — a JSON-like object describing which documents to target. MongoDB provides both single-document and multi-document variants of each operation. Single-document operations are atomic; multi-document operations process each document independently unless wrapped in a transaction.

Create — insertOne & insertMany

// Insert a single document — returns insertedId
db.users.insertOne({
  name: "Alice",
  email: "alice@example.com",
  role: "admin",
  createdAt: new Date()
});

// Insert multiple documents
// ordered: false = continue inserting on duplicate key errors
db.users.insertMany([
  { name: "Bob", email: "bob@example.com", role: "user" },
  { name: "Carol", email: "carol@example.com", role: "editor" }
], { ordered: false });

Read — find & findOne

// Find with query operators, projection, sort, limit
// (in mongosh, the projection goes in find()'s second argument)
db.users.find(
  {
    role: { $in: ["admin", "editor"] },
    createdAt: { $gte: ISODate("2026-01-01") }
  },
  { name: 1, email: 1, _id: 0 }
).sort({ createdAt: -1 }).limit(20);

// Find one document by exact match
db.users.findOne({ email: "alice@example.com" });

// Nested field query
db.users.find({ "address.city": "San Francisco" });

// Array query with $elemMatch
db.orders.find({ items: { $elemMatch: { qty: { $gt: 5 }, price: { $lt: 20 } } } });

// Query operators cheat sheet:
// $eq $ne $gt $gte $lt $lte     — comparison
// $in $nin                       — set membership
// $and $or $not $nor             — logical
// $exists $type                  — element
// $regex $text                   — string matching
// $elemMatch $size $all          — array

Update — updateOne & updateMany

// Update with multiple operators
db.users.updateOne(
  { email: "alice@example.com" },
  {
    $set: { role: "superadmin" },
    $inc: { loginCount: 1 },
    $push: { tags: "verified" },
    $currentDate: { lastModified: true }
  }
);

// Upsert — insert if not found
db.metrics.updateOne(
  { date: "2026-02-28", page: "/home" },
  { $inc: { views: 1 } },
  { upsert: true }
);

// Array update operators
db.posts.updateOne(
  { _id: postId },
  { $addToSet: { likes: userId },      // add only if not present
    $pull: { dislikes: userId } }       // remove from array
);

// Update operators:
// $set, $unset, $rename         — field
// $inc, $mul, $min, $max        — numeric
// $push, $pull, $addToSet, $pop — array

Delete — deleteOne & deleteMany

db.users.deleteOne({ email: "bob@example.com" });

db.sessions.deleteMany({
  lastAccess: { $lt: ISODate("2025-01-01") }
});

// findOneAndDelete — returns the deleted document
const deleted = db.queue.findOneAndDelete(
  { status: "pending" },
  { sort: { priority: -1 } }
);

3. Aggregation Pipeline

The aggregation pipeline processes documents through sequential stages. Each stage transforms documents as they pass through — like a Unix pipe. Place $match early to filter documents before expensive stages, and use $project to reduce document size mid-pipeline. The pipeline can use indexes for $match and $sort stages at the beginning.

// Revenue report by category — last 30 days
db.orders.aggregate([
  { $match: {
    status: "completed",
    orderDate: { $gte: ISODate("2026-01-29") }
  }},
  { $lookup: {
    from: "products", localField: "productId",
    foreignField: "_id", as: "product"
  }},
  { $unwind: "$product" },
  { $group: {
    _id: "$product.category",
    totalRevenue: { $sum: "$amount" },
    avgOrder: { $avg: "$amount" },
    count: { $sum: 1 }
  }},
  { $project: {
    category: "$_id", totalRevenue: { $round: ["$totalRevenue", 2] },
    avgOrder: { $round: ["$avgOrder", 2] }, count: 1, _id: 0
  }},
  { $sort: { totalRevenue: -1 } }
]);
| Stage | Purpose | SQL Equivalent |
|---|---|---|
| $match | Filter documents | WHERE |
| $group | Aggregate values | GROUP BY |
| $project | Reshape fields | SELECT |
| $lookup | Left outer join | LEFT JOIN |
| $unwind | Flatten array to one doc per element | UNNEST |
| $sort | Order results | ORDER BY |
| $facet | Multiple pipelines in parallel | Multiple queries |
| $addFields | Add computed fields | SELECT expr AS alias |
| $bucket | Group into value ranges | CASE WHEN + GROUP BY |

$facet — Multiple Aggregations in One Query

// Get both paginated results and total count in one query
db.products.aggregate([
  { $match: { category: "electronics" } },
  { $facet: {
    results: [
      { $sort: { price: -1 } },
      { $skip: 20 },
      { $limit: 10 },
      { $project: { name: 1, price: 1 } }
    ],
    totalCount: [
      { $count: "count" }
    ],
    priceRange: [
      { $group: {
        _id: null,
        minPrice: { $min: "$price" },
        maxPrice: { $max: "$price" }
      }}
    ]
  }}
]);
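The $skip and $limit numbers above come straight from page arithmetic. Since a pipeline stage is just a plain object, it can be built and inspected without a server. A sketch (paginationFacet is a hypothetical helper, not a MongoDB API):

```javascript
// Build a $facet stage for page-based pagination (pages start at 1).
function paginationFacet(page, pageSize) {
  return {
    $facet: {
      results: [
        { $sort: { price: -1 } },
        { $skip: (page - 1) * pageSize },
        { $limit: pageSize }
      ],
      totalCount: [{ $count: "count" }]
    }
  };
}

// Page 3 at 10 per page skips the first 20 documents:
paginationFacet(3, 10).$facet.results[1]; // { $skip: 20 }
```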

4. Indexing Strategies

Indexes are the most critical factor for query performance. Without an index, every query performs a collection scan (COLLSCAN) — reading every document. Follow the ESR rule: Equality fields first, Sort fields next, Range fields last. Use db.collection.getIndexes() to list existing indexes and db.collection.totalIndexSize() to check disk usage. Over-indexing wastes storage and slows writes, so create only the indexes your queries need.

Compound and Single-Field Indexes

// Single-field index with unique constraint
db.users.createIndex({ email: 1 }, { unique: true });

// Compound index — ESR rule
db.users.createIndex({ status: 1, name: 1, createdAt: 1 });

// Partial index — smaller and faster
db.users.createIndex(
  { email: 1 },
  { partialFilterExpression: { status: "active" } }
);

// Covered query — index has all fields, no doc fetch
db.users.createIndex({ email: 1, name: 1 });
db.users.find({ email: "alice@example.com" }, { name: 1, email: 1, _id: 0 });
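The ESR ordering can be mechanized. A sketch (esrIndexKey is a hypothetical helper, not a MongoDB API) that assembles a compound index key from a query's equality, sort, and range fields; JavaScript preserves insertion order for string keys, so the result is a valid createIndex() argument:

```javascript
// Build a compound index key following the ESR rule:
// Equality fields first, then Sort fields, then Range fields.
function esrIndexKey(equality, sort, range) {
  const key = {};
  for (const field of equality) key[field] = 1;
  for (const [field, dir] of Object.entries(sort)) key[field] = dir;
  for (const field of range) key[field] = 1;
  return key;
}

// For: find({ status: "active", createdAt: { $gte: ... } }).sort({ name: 1 })
esrIndexKey(["status"], { name: 1 }, ["createdAt"]);
// → { status: 1, name: 1, createdAt: 1 }
```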

Specialized Index Types

// Text index — full-text search
db.articles.createIndex(
  { title: "text", body: "text" },
  { weights: { title: 10, body: 1 } }
);
db.articles.find({ $text: { $search: "mongodb aggregation" } });

// TTL index — auto-expire documents after 7 days
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 604800 });

// 2dsphere geospatial index
db.places.createIndex({ location: "2dsphere" });
db.places.find({
  location: { $near: {
    $geometry: { type: "Point", coordinates: [-122.4, 37.8] },
    $maxDistance: 5000
  }}
});

// Wildcard index — flexible schema
db.events.createIndex({ "metadata.$**": 1 });
| Index Type | Best For | Limitation |
|---|---|---|
| Single-field | Simple equality / range | One field only |
| Compound | Multi-field queries (ESR) | Max 32 fields; order matters |
| Multikey | Array fields | Max 1 array per compound index |
| Text | Full-text search | One text index per collection |
| 2dsphere | Geospatial queries | GeoJSON format required |
| TTL | Auto-expire documents | Single Date field only |
| Hashed | Hash-based sharding | Equality only, no range |
| Wildcard | Dynamic schemas | No compound queries |

5. Schema Design Patterns

MongoDB schema design is driven by how your application queries data, not by normalization rules. Unlike relational databases where you normalize to third normal form and join at query time, MongoDB encourages denormalization to reduce the need for joins. The key decision is whether to embed related data inside a document or reference it via an ObjectId in another collection. Consider your read/write ratio, data access patterns, and the expected document size (max 16 MB).

| Criteria | Embed | Reference |
|---|---|---|
| Relationship | One-to-few (1-5) | One-to-many, many-to-many |
| Read pattern | Always read together | Read independently |
| Update pattern | Rarely updated alone | Updated independently |
| Data size | Small sub-document | Large or growing unbounded |
| Doc limit | Under 16 MB total | Could exceed 16 MB |

// EMBEDDED: Blog post with comments (one-to-few)
{
  _id: ObjectId("..."), title: "Schema Design",
  comments: [
    { user: "Bob", text: "Great!", date: ISODate("2026-02-01") },
    { user: "Carol", text: "Helpful", date: ISODate("2026-02-02") }
  ]
}

// REFERENCED: Orders referencing products (one-to-many)
{
  _id: ObjectId("..."), userId: ObjectId("..."),
  items: [
    { productId: ObjectId("..."), qty: 2, price: 29.99 },
    { productId: ObjectId("..."), qty: 1, price: 49.99 }
  ], total: 109.97
}

Common Patterns: Bucket, Polymorphic, Computed

// BUCKET — time-series data batched into hourly/daily buckets
{
  sensorId: "temp-001", date: ISODate("2026-02-28"),
  count: 60,
  measurements: [
    { ts: ISODate("...T00:00:00Z"), value: 22.1 },
    { ts: ISODate("...T00:01:00Z"), value: 22.3 }
  ],
  summary: { min: 21.5, max: 23.1, avg: 22.2 }
}

// POLYMORPHIC — different shapes in same collection
{ type: "car", make: "Toyota", doors: 4, mpg: 32 }
{ type: "truck", make: "Ford", payload: 2000, axles: 2 }

// COMPUTED — pre-calculate expensive aggregations
{
  _id: "product-42", name: "Widget Pro",
  reviewStats: { count: 287, avgRating: 4.3,
    distribution: { 5: 142, 4: 89, 3: 31, 2: 15, 1: 10 } }
}
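In the bucket and computed patterns, the summary field is maintained by the writer rather than recomputed at read time. A plain-JavaScript sketch of recomputing it from a bucket's measurements array (field names follow the example above):

```javascript
// Recompute a bucket document's summary from its measurements.
function summarize(measurements) {
  const values = measurements.map(m => m.value);
  const sum = values.reduce((a, b) => a + b, 0);
  const round1 = x => Math.round(x * 10) / 10; // one decimal place
  return {
    min: Math.min(...values),
    max: Math.max(...values),
    avg: round1(sum / values.length)
  };
}

summarize([{ value: 22.1 }, { value: 22.3 }]); // { min: 22.1, max: 22.3, avg: 22.2 }
```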

6. Multi-Document Transactions

MongoDB provides ACID transactions across multiple documents and collections since version 4.0 (replica sets) and 4.2 (sharded clusters). While single-document operations are already atomic, transactions let you coordinate writes across multiple documents with all-or-nothing semantics — just like in a relational database. Keep transactions short (under 60 seconds by default) and avoid them when single-document atomicity suffices, as they add latency and lock overhead.

const session = client.startSession();
try {
  session.startTransaction({
    readConcern: { level: "snapshot" },
    writeConcern: { w: "majority" }
  });
  const accounts = client.db("bank").collection("accounts");

  // Debit source
  const debit = await accounts.updateOne(
    { _id: "acct-001", balance: { $gte: 500 } },
    { $inc: { balance: -500 } }, { session }
  );
  if (debit.modifiedCount === 0) throw new Error("Insufficient funds");

  // Credit destination
  await accounts.updateOne(
    { _id: "acct-002" },
    { $inc: { balance: 500 } }, { session }
  );

  // Record in ledger
  await client.db("bank").collection("ledger").insertOne(
    { from: "acct-001", to: "acct-002", amount: 500, date: new Date() },
    { session }
  );

  await session.commitTransaction();
} catch (err) {
  await session.abortTransaction();
} finally {
  session.endSession();
}

7. Change Streams

Change streams subscribe to real-time data changes — inserts, updates, deletes, and replacements — without polling. Built on the oplog, they work with replica sets and sharded clusters. Use them for real-time notifications, cache invalidation, data synchronization, and event-driven microservices. Each change event includes a resume token so you can restart the stream from where you left off after a disconnect.

const pipeline = [
  { $match: { "fullDocument.status": "urgent",
    operationType: { $in: ["insert", "update"] } }}
];

const stream = db.collection("tickets").watch(pipeline, {
  fullDocument: "updateLookup"
});

stream.on("change", (event) => {
  console.log("Op:", event.operationType);
  console.log("Doc:", event.fullDocument);
  // Trigger notification, update cache, etc.
});

// Resume after disconnect
const token = loadTokenFromStorage();
const resumed = collection.watch(pipeline, { resumeAfter: token });

8. Mongoose ODM Basics

Mongoose is the most popular MongoDB ODM for Node.js, providing schema validation, type casting, middleware hooks, virtual properties, and a fluent query builder on top of the native driver. It enforces structure at the application level while MongoDB itself stays schema-flexible. Use .lean() on queries when you only need plain objects (skips Mongoose hydration for better performance).

import mongoose from "mongoose";

await mongoose.connect("mongodb://localhost:27017/myapp", {
  maxPoolSize: 10, serverSelectionTimeoutMS: 5000
});

const userSchema = new mongoose.Schema({
  name:  { type: String, required: true, trim: true },
  email: { type: String, required: true, unique: true, lowercase: true },
  age:   { type: Number, min: 0, max: 150 },
  role:  { type: String, enum: ["user", "admin", "editor"], default: "user" },
  tags:  [String],
  profile: { bio: String, avatar: String }
}, { timestamps: true });

// Virtual property
userSchema.virtual("isAdmin").get(function() {
  return this.role === "admin";
});

// Pre-save middleware
userSchema.pre("save", function(next) {
  if (this.isModified("email")) { /* validate */ }
  next();
});

const User = mongoose.model("User", userSchema);

// CRUD with Mongoose
const user = await User.create({ name: "Alice", email: "alice@dev.com" });
const admins = await User.find({ role: "admin" }).sort({ name: 1 }).lean();
await User.findByIdAndUpdate(user._id, { $push: { tags: "verified" } });

// Population — resolve references
const postSchema = new mongoose.Schema({
  title: String,
  author: { type: mongoose.Schema.Types.ObjectId, ref: "User" }
});
const Post = mongoose.model("Post", postSchema);
const posts = await Post.find().populate("author", "name email").lean();

9. Replica Sets and Sharding

Replica sets provide high availability with automatic failover. A typical set has three members: one primary (accepts writes) and two secondaries (replicate data asynchronously). If the primary fails, an election promotes a secondary within seconds. Sharding distributes data across multiple replica sets using a shard key. Choose your shard key carefully — it determines data distribution and query routing, and cannot easily be changed after sharding.

Replica Set Setup

// Connection string
"mongodb://mongo1:27017,mongo2:27017,mongo3:27017/mydb?replicaSet=rs0"

// Initiate replica set
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1:27017", priority: 2 },
    { _id: 1, host: "mongo2:27017", priority: 1 },
    { _id: 2, host: "mongo3:27017", priority: 1 }
  ]
});
db.getMongo().setReadPref("secondaryPreferred");

Sharding

// Enable sharding on a database
sh.enableSharding("analytics");

// Shard a collection — choose shard key carefully!
sh.shardCollection("analytics.events", {
  tenantId: 1, timestamp: 1
});

// Shard key strategies:
// Ranged:   { timestamp: 1 }        — good for range scans, hot shard risk
// Hashed:   { userId: "hashed" }    — even distribution, no range queries
// Compound: { region: 1, _id: 1 }   — zone-based + high cardinality

// Good shard key properties:
// - High cardinality (many unique values)
// - Even distribution (no hot spots)
// - Query isolation (most queries target a single shard)
// - Cannot be changed after sharding!
| Feature | Replica Set | Sharded Cluster |
|---|---|---|
| Purpose | HA, read scaling | Horizontal write/storage scaling |
| Data | Identical copies | Partitioned across shards |
| Failover | Automatic election | Per-shard automatic |
| When to use | Always (production baseline) | Data exceeds single server |
| Components | Primary + secondaries | Shards + config servers + mongos |
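To see why a monotonically increasing shard key creates a hot shard, compare ranged and hashed placement. This is a toy simulation: the hash below is a stand-in for illustration, not MongoDB's actual (MD5-based) hashed index:

```javascript
// Ranged placement: keys are split into contiguous chunks.
function rangedShard(key, numShards, maxKey) {
  return Math.min(numShards - 1, Math.floor((key / maxKey) * numShards));
}

// Hashed placement: a toy multiplicative hash spreads nearby keys apart.
function hashedShard(key, numShards) {
  return (key * 2654435761 % 2 ** 32) % numShards;
}

// The ten newest keys (90..99 of 100) all land on the last ranged shard:
const newest = Array.from({ length: 10 }, (_, i) => 90 + i);
newest.map(k => rangedShard(k, 3, 100));  // every key hits shard 2 (hot shard)
newest.map(k => hashedShard(k, 3));       // spread across multiple shards
```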

10. Performance Optimization

Performance tuning in MongoDB starts with understanding your query patterns. Use explain() to analyze query execution plans, the database profiler to discover slow queries, and connection pooling to manage driver resources efficiently. The most common performance issue is missing indexes — a single missing index can turn a 2ms query into a multi-second collection scan. Always check explain() output for COLLSCAN stages and high totalDocsExamined relative to nReturned.

explain() and Database Profiler

// Analyze query execution plan
db.orders.find({ status: "pending" }).explain("executionStats");

// Key fields in explain output:
// executionStats.nReturned         — docs returned to client
// executionStats.totalDocsExamined — docs scanned (want close to nReturned)
// executionStats.totalKeysExamined — index keys scanned
// executionStats.executionTimeMillis — total query time
// winningPlan.stage               — IXSCAN (good) vs COLLSCAN (bad)
// winningPlan.inputStage.indexName — which index was chosen

// Enable database profiler for slow queries (> 100ms)
db.setProfilingLevel(1, { slowms: 100 });

// Query the profiler output
db.system.profile.find({
  millis: { $gt: 200 },
  ns: "mydb.orders"
}).sort({ ts: -1 }).limit(5);

// Disable profiler when done
db.setProfilingLevel(0);
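The checklist in the comments above can be folded into a small audit helper. A sketch (auditQueryPlan is a hypothetical name; it takes a simplified slice of explain("executionStats") output, not the full document):

```javascript
// Flag common problems from explain("executionStats") output.
// `stage` is the winning plan's stage; the other two fields
// come from executionStats.
function auditQueryPlan({ stage, nReturned, totalDocsExamined }) {
  const issues = [];
  if (stage === "COLLSCAN") {
    issues.push("collection scan: add an index for this query");
  }
  if (nReturned > 0 && totalDocsExamined / nReturned > 10) {
    issues.push("examined/returned ratio is high: index is not selective");
  }
  return issues;
}

auditQueryPlan({ stage: "COLLSCAN", nReturned: 10, totalDocsExamined: 5000 });
// both issues fire
auditQueryPlan({ stage: "IXSCAN", nReturned: 20, totalDocsExamined: 20 });
// → []
```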

Connection Pooling and Driver Configuration

const { MongoClient } = require("mongodb");

const client = new MongoClient(uri, {
  maxPoolSize: 50,              // max concurrent connections
  minPoolSize: 5,               // keep warm connections
  maxIdleTimeMS: 60000,         // close idle after 60s
  serverSelectionTimeoutMS: 5000,
  socketTimeoutMS: 45000,
  compressors: ["zstd", "snappy"],  // network compression
  readPreference: "secondaryPreferred",
  readConcern: { level: "majority" },
  writeConcern: { w: "majority", j: true }
});

// IMPORTANT: Create client ONCE and reuse across requests
// Do NOT create a new MongoClient per request!

Performance Best Practices Summary

| Practice | Why |
|---|---|
| Create compound indexes (ESR) | Avoid COLLSCAN, support queries efficiently |
| Project only needed fields | Reduce network transfer and memory |
| Use covered queries | Avoid document fetch entirely |
| Reuse MongoClient | Connection pooling, avoid TCP overhead |
| Enable compression (zstd) | 30-70% less network bandwidth |
| Use readPreference secondary | Distribute read load |
| Avoid unbounded arrays | Prevent 16 MB limit and slow updates |
| Monitor with profiler | Identify slow queries proactively |

11. Atlas Features and MongoDB Compass

MongoDB Atlas is the fully managed cloud database service available on AWS, GCP, and Azure. It handles provisioning, patching, automated backups with point-in-time recovery, and auto-scaling. The free M0 tier (512 MB) is ideal for development and learning. MongoDB Compass is the official desktop GUI for visualizing schemas, building aggregation pipelines, managing indexes, and analyzing query performance with visual explain plans.

// Atlas connection (SRV format — auto-discovers replica set)
"mongodb+srv://user:pass@cluster0.abc123.mongodb.net/mydb"

// Key Atlas features:
// - Automated backups with point-in-time recovery
// - Auto-scaling storage and compute
// - Global clusters with zone-based sharding
// - Performance Advisor (automatic index recommendations)
// - Atlas Charts (built-in data visualization)
// - Atlas Data Federation (query S3 + Atlas together)

// Atlas Search — Lucene-based full-text via aggregation
db.products.aggregate([
  { $search: {
    index: "product_search",
    compound: {
      must: [{ text: { query: "wireless headphones", path: "title" } }],
      filter: [{ range: { path: "price", lte: 100 } }]
    }
  }},
  { $limit: 10 },
  { $project: { title: 1, price: 1, score: { $meta: "searchScore" } } }
]);

// MongoDB Compass features:
// - Visual schema analyzer
// - Aggregation pipeline builder (drag-and-drop stages)
// - Visual explain plan for query optimization
// - Index management and performance metrics
// - CRUD operations with a graphical interface

12. Security: Authentication, Authorization & Encryption

Production MongoDB deployments require multiple security layers: authentication (who can connect), authorization (what they can do), encryption (protecting data in transit and at rest), and network controls (who can reach the server). Always enable authentication — by default, MongoDB allows unauthenticated connections, which is a common source of data breaches.

// Create an admin user
use admin
db.createUser({
  user: "appAdmin", pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "myapp" },
    { role: "dbAdmin", db: "myapp" }
  ]
});

// Read-only user for reporting
db.createUser({
  user: "reporter", pwd: passwordPrompt(),
  roles: [{ role: "read", db: "myapp" }]
});

// Built-in roles:
// read, readWrite         — data access
// dbAdmin, dbOwner        — database admin
// clusterAdmin            — cluster ops
// userAdminAnyDatabase    — user management

// TLS/SSL connection:
// mongod --tlsMode requireTLS \
//   --tlsCertificateKeyFile /etc/ssl/mongodb.pem \
//   --tlsCAFile /etc/ssl/ca.pem

// Encryption at rest (Atlas or Enterprise):
// Atlas: AES-256 enabled by default
// Enterprise: KMIP or local key provider

// Client-Side Field Level Encryption (CSFLE)
// Encrypts fields BEFORE they leave the driver
// Even DBAs cannot read encrypted fields
| Security Layer | Mechanism | Protects Against |
|---|---|---|
| Authentication | SCRAM-SHA-256, x.509, LDAP | Unauthorized connections |
| Authorization | Role-based access control | Privilege escalation |
| Encryption in transit | TLS/SSL | Network eavesdropping |
| Encryption at rest | AES-256, KMIP | Disk theft |
| Field-level encryption | CSFLE / Queryable Encryption | Admin seeing sensitive data |
| Network | IP whitelist, VPC peering | External attacks |

Key Takeaways
  • MongoDB stores documents as BSON with rich types: ObjectId, Date, Decimal128, embedded docs, arrays
  • CRUD uses insertOne, find, updateOne, deleteOne with operators like $gt, $in, $regex
  • The aggregation pipeline chains stages ($match, $group, $lookup, $unwind) for analytics and joins
  • Follow the ESR rule for compound indexes: Equality, Sort, Range field order
  • Embed for one-to-few read-together data; reference for one-to-many or independent access
  • Multi-document transactions provide ACID — but prefer single-document atomicity when possible
  • Change streams enable real-time event-driven architectures without polling
  • Use explain("executionStats") to verify index usage and avoid COLLSCAN
  • Replica sets for HA (production baseline); add sharding when data exceeds single-server capacity
  • Always enable authentication, use TLS, apply RBAC, and consider CSFLE for sensitive fields