go/packages: revise documentation

I made a pass through the documentation trying to simplify
and make it more accessible to readers who are unfamiliar
with all the ins and outs of the implementation.

I also added TODOs about things I don't understand
and names that we should think about changing.

Change-Id: I633316bd407f3360eb8a683bc8d85fba93ca381e
Reviewed-on: https://go-review.googlesource.com/125305
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Russ Cox 2018-07-20 16:35:14 -04:00
parent 60ffea201e
commit 96ee42108a
3 changed files with 238 additions and 265 deletions

@ -3,116 +3,89 @@
// license that can be found in the LICENSE file.
/*
Package packages loads Go packages for inspection and analysis.
Package packages provides information about Go packages,
such as their path, source files, and imports.
It can optionally load, parse, and type-check the source files of a
package, and obtain type information for their dependencies either by
loading export data files produced by the Go compiler or by
recursively loading dependencies from source code.
NOTE: THIS PACKAGE IS NOT YET READY FOR WIDESPREAD USE:
- The interface is still being revised and is likely to change.
- The implementation depends on the Go 1.11 go command.
- We intend to finalize the API before Go 1.11 is released.
THIS INTERFACE IS EXPERIMENTAL AND IS LIKELY TO CHANGE.
The three loaders Metadata, TypeCheck, and WholeProgram provide differing
amounts of detail about the loaded packages but otherwise behave the same.
All three take as input a list of patterns and return a list of Package structs
describing individual packages matched by those patterns.
This package currently requires a go1.11 version of go list;
its functions will return a GoTooOldError for older toolchains.
The patterns are used as arguments to the underlying build tool,
such as the go command or Bazel, and are interpreted according to
that tool's conventions.
This package is intended to replace golang.org/x/tools/go/loader.
It provides a simpler interface to the same functionality and serves
as a foundation for analysis tools that work with 'go build',
including its support for versioned packages,
and also with alternative build systems such as Bazel and Blaze.
The Package struct provides basic information about the package, including
Its primary operation is to load packages through
the Metadata, TypeCheck, and WholeProgram functions,
which accept a list of string arguments that denote
one or more packages according to the conventions
of the underlying build system.
- ID, a unique identifier for the package in the returned set;
- PkgPath, the import path for the package when used in a build;
- Srcs, the names of the package's Go source files;
- Imports, a map from source import strings to the Packages they name;
- Type, the type information for the package's exported symbols;
- Files, the parsed syntax trees for the package's source code; and
- Info, the result of a complete type-check of the package syntax trees.
For example, in a 'go build' workspace,
they may be a list of package names,
or relative directory names,
or even an ad-hoc list of source files:
(See the documentation for type Package for the complete list of fields
and more detailed descriptions.)
fmt
encoding/json
./json
a.go b.go
For example,
For a Bazel project, the arguments use Bazel's package notation:
Metadata(nil, "bytes", "unicode...")
@repo//project:target
//project:target
:target
target
returns four Package structs describing the standard library packages
bytes, unicode, unicode/utf16, and unicode/utf8. Note that one pattern
can match multiple packages and that a package might be matched by
multiple patterns: in general it is not possible to determine which
packages correspond to which patterns.
An application that loads packages can thus pass its command-line
arguments directly to the loading functions and it will integrate with the
usual conventions for that project.
Note that the list returned by the loader (Metadata in this case)
only contains the packages matched by the patterns. Their dependencies
can be found by walking the import graph using the Imports fields.
The result of a call to a loading function is a set of Package
objects describing the packages denoted by the arguments.
These "initial" packages are in fact the roots of a graph of Packages,
the import graph, that includes complete transitive dependencies.
Clients may traverse the import graph by following the edges in the
Package.Imports map, which relates the import paths that appear in the
package's source files to the packages they import.
As noted earlier, the three loaders provide increasing amounts of detail
about the loaded packages.
Each package has three kinds of name: ID, PkgPath, and Name.
A package's ID is an unspecified identifier that uniquely
identifies it throughout the workspace, and thus may be used as a key in
a map of packages. Clients should not interpret this string, no matter
how intelligible it looks, as its structure varies across build systems.
A package's PkgPath is the name by which the package is known to the
compiler, linker, and runtime: it is the string returned by
reflect.Type.PkgPath or fmt.Sprintf("%T", x). The PkgPath is not
necessarily unique throughout the workspace; for example, an in-package
test has the same PkgPath as the package under test.
A package's Name is the identifier that appears in the "package"
declaration at the start of each of its source files,
and is the name declared when importing it into another file.
A package whose Name is "main" is linked as an executable.
Metadata loads information about package location, source files, and imports.
The loader's three entry points, Metadata, TypeCheck, and
WholeProgram, provide increasing levels of detail.
TypeCheck adds type information for all packages, including dependencies,
and type-checked syntax trees only for the packages matched by the patterns.
Metadata returns only a description of each package,
its source files and imports.
Some build systems permit build steps to generate
Go source files that are then compiled.
The Packages describing such a program report
the locations of the generated files.
The process of loading packages invokes the
underlying build system to ensure that these
files are present and up-to-date.
WholeProgram adds type-checked syntax trees for all packages,
including dependencies.
Although 'go build' does not in general allow code generation,
it does in a limited form in its support for cgo.
For a package whose source files import "C", subjecting them to cgo
preprocessing, the loader reports the location of the pure-Go source
files generated by cgo. This too may entail a partial build.
Cgo processing is disabled for Metadata queries,
or when the DisableCgo option is set.
The loaders can be configured by passing a non-nil Options struct as
the first argument. See the documentation for type Options for details.
TypeCheck additionally loads, parses, and type-checks
the source files of the initial packages,
and exposes their syntax trees and type information.
Type information for dependencies of the initial
packages is obtained not from Go source code but from
compiler-generated export data files.
Again, loading invokes the underlying build system to
ensure that these files are present and up-to-date.
Most tools should pass their command-line arguments (after any flags)
uninterpreted to the loader, so that the loader can interpret them
according to the conventions of the underlying build system.
For example, this program prints the names of the source files
for each package listed on the command line:
WholeProgram loads complete type information about
the initial packages and all of their transitive dependencies.
package main
Example:
import (
"flag"
"fmt"
"log"
pkgs, err := packages.TypeCheck(nil, flag.Args()...)
if err != nil { ... }
for _, pkg := range pkgs {
...
"golang.org/x/tools/go/packages"
)
func main() {
flag.Parse()
pkgs, err := packages.Metadata(nil, flag.Args()...)
if err != nil {
log.Fatal(err)
}
for _, pkg := range pkgs {
fmt.Print(pkg.ID, pkg.Srcs)
}
}
*/
package packages // import "golang.org/x/tools/go/packages"
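The package comment above distinguishes ID (unique across the returned set) from PkgPath (not necessarily unique). The following is a minimal sketch of why ID is the safer map key. It uses a hypothetical stand-in `Package` type with only two fields rather than the real go/packages type, and the helper names `byID` and `byPkgPath` are invented for illustration. The sample IDs are taken from the Options.Tests comment in this commit.

```go
package main

import "fmt"

// Package is a hypothetical stand-in for packages.Package,
// reduced to the two naming fields this sketch compares.
type Package struct {
	ID      string
	PkgPath string
}

// byID indexes packages by ID; each ID maps to exactly one package.
func byID(pkgs []*Package) map[string]*Package {
	m := make(map[string]*Package)
	for _, p := range pkgs {
		m[p.ID] = p
	}
	return m
}

// byPkgPath groups packages by PkgPath; a group may hold more than
// one package, for example a package and its in-package test variant.
func byPkgPath(pkgs []*Package) map[string][]*Package {
	m := make(map[string][]*Package)
	for _, p := range pkgs {
		m[p.PkgPath] = append(m[p.PkgPath], p)
	}
	return m
}

func main() {
	pkgs := []*Package{
		{ID: "fmt", PkgPath: "fmt"},
		{ID: "fmt [fmt.test]", PkgPath: "fmt"},
	}
	// Two distinct IDs, but both packages share one PkgPath.
	fmt.Println(len(byID(pkgs)), len(byPkgPath(pkgs)["fmt"]))
}
```

Keying a map by PkgPath alone would silently drop one of the two variants, which is the situation the ID field is meant to avoid.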

@ -18,11 +18,15 @@ import (
"strings"
)
// A GoTooOldError indicates that the go command predates the Go
// 1.11 features needed by this package. This error is a stopgap measure
// until the necessary features can be emulated in terms of an older go
// command, at which point this error will no longer be used.
type GoTooOldError struct{ error }
// A GoTooOldError reports that the go command
// found by exec.LookPath does not contain the necessary
// support to be used with go/packages.
// Currently, go/packages requires Go 1.11 or later.
// (We intend to issue a point release for Go 1.10
// so that go/packages can be used with updated Go 1.10 systems too.)
type GoTooOldError struct {
error
}
// golistPackages uses the "go list" command to expand the
// pattern words and return metadata for the specified packages.

@ -20,94 +20,99 @@ import (
"golang.org/x/tools/go/gcexportdata"
)
// An Options holds the options for a call to Metadata, TypeCheck
// or WholeProgram to load Go packages from source code.
// An Options specifies details about how packages should be loaded.
// The loaders do not modify this struct.
// TODO(rsc): Better name would be Config.
type Options struct {
// Fset is the file set for the parser
// to use when loading the program.
Fset *token.FileSet
// Context may be used to cancel a pending call.
// Context is optional; the default behavior
// is equivalent to context.Background().
// Context specifies the context for the load operation.
// If the context is cancelled, the loader may stop early
// and return an ErrCancelled error.
// If Context is nil, the load cannot be cancelled.
Context context.Context
// The Tests flag causes the result to include any test packages
// implied by the patterns.
//
// For example, under 'go build', the "fmt" pattern ordinarily
// identifies a single importable package, but with the Tests
// flag it additionally denotes the "fmt.test" executable, which
// in turn depends on the variant of "fmt" augmented by its
// in-package tests, and the "fmt_test" external test package.
//
// For build systems in which test names are explicit,
// this flag may have no effect.
Tests bool
// Dir is the directory in which to run the build system tool
// that provides information about the packages.
// If Dir is empty, the tool is run in the current directory.
Dir string
// DisableCgo disables cgo-processing of files that import "C",
// and removes the 'cgo' build tag, which may affect source file selection.
// By default, TypeCheck and WholeProgram queries process such
// files, and the resulting Package.Srcs describes the generated
// files seen by the compiler.
// TODO(rsc): Drop entirely. I don't think these are the right semantics.
DisableCgo bool
// TypeChecker contains options relating to the type checker,
// such as the Sizes function.
// Env is the environment to use when invoking the build system tool.
// If Env is nil, the current environment is used.
// Like in os/exec's Cmd, only the last value in the slice for
// each environment key is used. To specify the setting of only
// a few variables, append to the current environment, as in:
//
// opt.Env = append(os.Environ(), "GOOS=plan9", "GOARCH=386")
//
// The following fields of TypeChecker are ignored:
// - Import: the Loader provides the import machinery.
// - Error: errors are reported to the Error function, below.
TypeChecker types.Config
// Error is called for each error encountered during package loading.
// Implementations must be concurrency-safe.
// If nil, the default implementation prints errors to os.Stderr.
// Errors are additionally recorded in each Package.
// Error is not used in Metadata mode.
Error func(error)
// ParseFile is called to read and parse each file.
// Implementations must be concurrency-safe.
// If nil, the default implementation uses parser.ParseFile.
// A client may supply a custom implementation to,
// for example, provide alternative contents for files
// modified in a text editor but unsaved,
// or to selectively eliminate unwanted function
// bodies to reduce the load on the type-checker.
// ParseFile is not used in Metadata mode.
ParseFile func(fset *token.FileSet, filename string) (*ast.File, error)
// Env is a list of environment variables to pass through
// to the build system's metadata query tool.
// If nil, the current process's environment is used.
Env []string
// Dir is the directory in which to run the build system's metadata query tool.
// If "", the current process's working directory is used.
Dir string
// Error is called for each error encountered during package loading.
// It must be safe to call Error simultaneously from multiple goroutines.
// In addition to calling Error, the loader will record each error
// in the corresponding Package's Errors list.
// If Error is nil, the loader will print errors to os.Stderr.
// To disable printing of errors, set opt.Error = func(error){}.
// TODO(rsc): What happens in the Metadata loader? Currently nothing.
Error func(error)
// Fset is the token.FileSet to use when parsing loaded source files.
// If Fset is nil, the loader will create one.
Fset *token.FileSet
// ParseFile is called to read and parse each file
// when preparing a package's type-checked syntax tree.
// It must be safe to call ParseFile simultaneously from multiple goroutines.
// If ParseFile is nil, the loader will use parser.ParseFile.
//
// Setting ParseFile to a custom implementation can allow
// providing alternate file content in order to type-check
// unsaved text editor buffers, or to selectively eliminate
// unwanted function bodies to reduce the amount of work
// done by the type checker.
ParseFile func(fset *token.FileSet, filename string) (*ast.File, error)
// If Tests is set, the loader includes not just the packages
// matching a particular pattern but also any related test packages,
// including test-only variants of the package and the test executable.
//
// For example, when using the go command, loading "fmt" with Tests=true
// returns four packages, with IDs "fmt" (the standard package),
// "fmt [fmt.test]" (the package as compiled for the test),
// "fmt_test" (the test functions from source files in package fmt_test),
// and "fmt.test" (the test binary).
//
// In build systems with explicit names for tests,
// setting Tests may have no effect.
Tests bool
// TypeChecker provides additional configuration for type-checking syntax trees.
//
// The TypeCheck loader does not use the TypeChecker configuration
// for packages that have their type information provided by the
// underlying build system.
//
// The TypeChecker.Error function is ignored:
// errors are reported using the Error function defined above.
//
// The TypeChecker.Importer function is ignored:
// the loader defines an appropriate importer.
//
// The TypeChecker.Sizes are only used by the WholeProgram loader.
// The TypeCheck loader uses the same sizes as the main build.
// TODO(rsc): At least, it should. Derive these from runtime?
TypeChecker types.Config
}
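The Env comment above states that, as with os/exec's Cmd, only the last value in the slice for each key is used. The following is a rough sketch of that rule only. The helper `lastValueWins` is hypothetical and is not how the loader or os/exec resolve duplicates internally; it just shows why appending to an existing environment overrides earlier settings.

```go
package main

import (
	"fmt"
	"strings"
)

// lastValueWins is a hypothetical helper that resolves a list of
// "KEY=value" strings into a map, letting later entries override
// earlier ones for the same key.
func lastValueWins(env []string) map[string]string {
	out := make(map[string]string)
	for _, kv := range env {
		i := strings.IndexByte(kv, '=')
		if i < 0 {
			continue // skip malformed entries that have no '='
		}
		out[kv[:i]] = kv[i+1:] // a later entry overwrites an earlier one
	}
	return out
}

func main() {
	// base stands in for os.Environ() so the sketch is deterministic.
	base := []string{"GOOS=linux", "GOARCH=amd64"}
	env := append(base, "GOOS=plan9", "GOARCH=386")
	m := lastValueWins(env)
	fmt.Println(m["GOOS"], m["GOARCH"])
}
```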
// Metadata returns the metadata for a set of Go packages,
// but does not parse or type-check their source files.
// The returned packages are the roots of a directed acyclic graph,
// the "import graph", whose edges are represented by Package.Imports
// and whose transitive closure includes all dependencies of the
// initial packages.
//
// The packages are denoted by patterns, using the usual notation of the
// build system (currently "go build", but in future others such as
// Bazel). Clients should not attempt to infer the relationship between
// patterns and the packages they denote, as in general it is complex
// and many-to-many. Metadata reports an error if the patterns denote no
// packages.
//
// If Metadata was unable to expand the specified patterns to a set of
// packages, or if there was a cycle in the dependency graph, it returns
// an error. Otherwise it returns a set of loaded Packages, even if
// errors were encountered while loading some of them; such errors are
// recorded in each Package.
//
// Metadata loads and returns the Go packages named by the given patterns,
// omitting type information and type-checked syntax trees from all packages.
// TODO(rsc): Better name would be Load.
func Metadata(o *Options, patterns ...string) ([]*Package, error) {
l := &loader{mode: metadata}
if o != nil {
@ -116,41 +121,10 @@ func Metadata(o *Options, patterns ...string) ([]*Package, error) {
return l.load(patterns...)
}
// TypeCheck returns metadata, syntax trees, and type information
// for a set of Go packages.
//
// In addition to the information returned by the Metadata function,
// TypeCheck loads, parses, and type-checks each of the requested packages.
// These packages are "source packages", and the resulting Package
// structure provides complete syntax and type information.
// Due to limitations of the type checker, any package that transitively
// depends on a source package must also be loaded from source.
//
// For each immediate dependency of a source package that is not itself
// a source package, type information is obtained from export data
// files produced by the Go compiler; this mode may entail a partial build.
// The Package for these dependencies provides complete package-level type
// information (types.Package), but no syntax trees.
//
// The remaining packages, comprising the indirect dependencies of the
// packages with complete export data, may have partial package-level type
// information or perhaps none at all.
//
// For example, consider the import graph A->B->C->D->E.
// If the requested packages are A and C,
// then packages A, B, C are source packages,
// D is a complete export data package,
// and E is a partial export data package.
// (B must be a source package because it
// transitively depends on C, a source package.)
//
// Each package bears a flag, IllTyped, indicating whether it
// or one of its transitive dependencies contains an error.
// A package that is not IllTyped is buildable.
//
// Use this mode for compiler-like tools
// that analyze one package at a time.
//
// TypeCheck loads and returns the Go packages named by the given patterns.
// It includes type information in all packages, including dependencies.
// The packages named by the patterns also have type-checked syntax trees.
// TODO(rsc): Better name would be LoadTyped.
func TypeCheck(o *Options, patterns ...string) ([]*Package, error) {
l := &loader{mode: typeCheck}
if o != nil {
@ -159,14 +133,10 @@ func TypeCheck(o *Options, patterns ...string) ([]*Package, error) {
return l.load(patterns...)
}
// WholeProgram returns metadata, complete syntax trees, and complete
// type information for a set of Go packages and their entire transitive
// closure of dependencies.
// Every package in the returned import graph is a source package,
// as defined by the documentation for TypeCheck.
//
// Use this mode for whole-program analysis tools.
//
// WholeProgram loads and returns the Go packages named by the given patterns.
// It includes type information and type-checked syntax trees for all packages,
// including dependencies.
// TODO(rsc): Better name would be LoadAllTyped.
func WholeProgram(o *Options, patterns ...string) ([]*Package, error) {
l := &loader{mode: wholeProgram}
if o != nil {
@ -175,71 +145,93 @@ func WholeProgram(o *Options, patterns ...string) ([]*Package, error) {
return l.load(patterns...)
}
// Package holds the metadata, and optionally syntax trees
// and type information, for a single Go package.
//
// The import graph, Imports, forms a directed acyclic graph over Packages.
// (Cycle-forming edges are not inserted into the map.)
//
// A Package is not mutated once returned.
// A Package describes a single loaded Go package.
type Package struct {
// ID is a unique, opaque identifier for a package,
// as determined by the underlying workspace.
// ID is a unique identifier for a package,
// in a syntax provided by the underlying build system.
//
// IDs distinguish packages that have the same PkgPath, such as
// a regular package and the variant of that package built
// during testing. (IDs also distinguish packages that would be
// lumped together by the go/build API, such as a regular
// package and its external tests.)
//
// Clients should not interpret the ID string as its
// structure varies from one build system to another.
// Because the syntax varies based on the build system,
// clients should treat IDs as opaque and not attempt to
// interpret them.
ID string
// PkgPath is the path of the package as understood
// by the Go compiler and by reflect.Type.PkgPath.
// PkgPath is the import path of the package during a particular build.
//
// PkgPaths are unique for each package in a given executable
// program, but are not necessarily unique within a workspace.
// For example, an importable package (fmt) and its in-package
// tests (fmt·test) may have the same PkgPath, but those
// two packages are never linked together.
// Analyses that need a unique string to identify a returned Package
// should use ID, not PkgPath. Although PkgPath does uniquely identify
// a package in a particular build, the loader may return packages
// spanning multiple builds (for example, multiple commands,
// or a package and its tests), so PkgPath is not guaranteed unique
// across all packages returned by a single load.
//
// TODO(rsc): This name should be ImportPath.
PkgPath string
// Name is the identifier appearing in the package declaration
// at the start of each source file in this package.
// The name of an executable is "main".
// Name is the package name as it appears in the package source code.
Name string
// Srcs is the list of names of this package's Go
// source files as presented to the compiler.
// Names are guaranteed to be absolute.
//
// In Metadata queries, or if DisableCgo is set,
// Srcs includes the unmodified source files even
// if they use cgo (import "C").
// In all other queries, Srcs contains the files
// resulting from cgo processing.
Srcs []string
// Errors lists any errors encountered while loading the package.
// TODO(rsc): Say something about the errors or at least their Strings,
// as far as file:line being at the beginning and so on.
Errors []error
// OtherSrcs is the list of names of non-Go source files that the package
// contains. This includes assembly and C source files.
// Names are guaranteed to be absolute.
OtherSrcs []string
// Imports maps each import path to its package.
// The keys are import paths as they appear in the source files.
// Imports maps import paths appearing in the package's Go source files
// to corresponding loaded Packages.
Imports map[string]*Package
// syntax and type information (only in TypeCheck and WholeProgram modes)
Fset *token.FileSet // source position information
Files []*ast.File // syntax trees for the package's Srcs files
Errors []error // non-nil if the package had errors
Type *types.Package // type information about the package
Info *types.Info // type-checker deductions
IllTyped bool // this package or a dependency has a parse or type error
// Srcs lists the absolute file paths of the package's Go source files.
//
// If a package has typed syntax trees and the DisableCgo option is false,
// the cgo-processed output files are listed instead of the original
// source files that contained import "C" statements.
// In this case, the file paths may not even end in ".go".
// Although the original sources are not listed in Srcs, the corresponding
// syntax tree positions will still refer back to the original source code,
// respecting the //line directives in the cgo-processed output.
//
// TODO(rsc): Actually, in TypeCheck mode even the packages without
// syntax trees (pure dependencies) lose their original sources.
// We should fix that.
//
// TODO(rsc): This should be GoFiles.
Srcs []string
// OtherSrcs lists the absolute file paths of the package's non-Go source files,
// including assembly, C, C++, Fortran, Objective-C, SWIG, and so on.
//
// TODO(rsc): This should be OtherFiles.
OtherSrcs []string
// Type is the type information for the package.
// The TypeCheck and WholeProgram loaders set this field for all packages.
// TODO(rsc): This should be Types.
Type *types.Package
// IllTyped indicates whether the package has any type errors.
// The TypeCheck and WholeProgram loaders set this field for all packages.
IllTyped bool
// Files is the package's syntax trees, for the files listed in Srcs.
//
// The TypeCheck loader sets Files for packages matching the patterns.
// The WholeProgram loader sets Files for all packages, including dependencies.
//
// TODO(rsc): This should be ASTs or Syntax.
Files []*ast.File
// Info is the type-checking results for the package's syntax trees.
// It is set only when Files is set.
//
// TODO(rsc): This should be TypesInfo.
Info *types.Info
// Fset is the token.FileSet for the package's syntax trees listed in Files.
// It is set only when Files is set.
// All packages loaded together share a single Fset.
Fset *token.FileSet
// ---- temporary state ----
// the Package struct should be pure exported data.
// export holds the path to the export data file
// for this package, if mode == TypeCheck.
@ -734,9 +726,13 @@ func (ld *loader) loadFromExportData(lpkg *Package) (*types.Package, error) {
return tpkg, nil
}
// All returns a new map containing all the transitive dependencies of
// the specified initial packages, keyed by ID.
func All(initial []*Package) map[string]*Package {
// All returns a map, from package ID to package,
// containing the packages in the given list and all their dependencies.
// Each call to All returns a new map.
//
// TODO(rsc): I don't understand why this function exists.
// It might be more useful to return a slice in dependency order.
func All(list []*Package) map[string]*Package {
all := make(map[string]*Package)
var visit func(p *Package)
visit = func(p *Package) {
@ -747,7 +743,7 @@ func All(initial []*Package) map[string]*Package {
}
}
}
for _, p := range initial {
for _, p := range list {
visit(p)
}
return all
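The TODO on All suggests that a slice in dependency order might be more useful than a map. The following is one possible sketch of that idea, not part of this commit. It uses a hypothetical stand-in `Package` type with only ID and Imports, and the function name `inDependencyOrder` is invented here. It performs a post-order depth-first walk so that each package appears after everything it imports.

```go
package main

import (
	"fmt"
	"sort"
)

// Package is a hypothetical stand-in for packages.Package,
// reduced to the two fields this sketch needs.
type Package struct {
	ID      string
	Imports map[string]*Package
}

// inDependencyOrder returns the packages in list and all their
// transitive dependencies, with each dependency placed before
// the packages that import it.
func inDependencyOrder(list []*Package) []*Package {
	seen := make(map[string]bool)
	var order []*Package
	var visit func(p *Package)
	visit = func(p *Package) {
		if seen[p.ID] {
			return
		}
		seen[p.ID] = true
		// Walk imports in sorted key order so the result is deterministic.
		paths := make([]string, 0, len(p.Imports))
		for path := range p.Imports {
			paths = append(paths, path)
		}
		sort.Strings(paths)
		for _, path := range paths {
			visit(p.Imports[path])
		}
		order = append(order, p) // post-order: dependencies come first
	}
	for _, p := range list {
		visit(p)
	}
	return order
}

func main() {
	c := &Package{ID: "c"}
	b := &Package{ID: "b", Imports: map[string]*Package{"c": c}}
	a := &Package{ID: "a", Imports: map[string]*Package{"b": b, "c": c}}
	for _, p := range inDependencyOrder([]*Package{a}) {
		fmt.Println(p.ID)
	}
}
```

Like All, this sketch assumes the import graph is acyclic, which the removed Package comment in this diff says the loader guarantees by not inserting cycle-forming edges.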