From b4a510f1cdfefd9edfe6847798519704377a7d4c Mon Sep 17 00:00:00 2001
From: Brett Porter
Date: Wed, 2 Dec 2009 04:45:50 +0000
Subject: [PATCH] [MRM-1025] update the content model with information from the former project model database. Not all of this is stored in the current implementation as it wasn't previously used.

git-svn-id: https://svn.apache.org/repos/asf/archiva/branches/MRM-1025@886048 13f79535-47bb-0310-9956-ffa450edef68
---
 archiva-modules/metadata/content-model.txt | 161 ++++++++++++++++-----
 1 file changed, 127 insertions(+), 34 deletions(-)

diff --git a/archiva-modules/metadata/content-model.txt b/archiva-modules/metadata/content-model.txt
index 6f9511572..007c9e552 100644
--- a/archiva-modules/metadata/content-model.txt
+++ b/archiva-modules/metadata/content-model.txt
@@ -11,48 +11,141 @@ The following is the intended content model for the metadata content repository:
  |   `-- org/
  |       `-- apache/
  |           `-- archiva/
- |               `-- platform/ -- these are known as the namespace, of arbitrary depth. Equiv to groupId in Maven
- |                   `-- scanner/ -- this is the project - equivalent to artifactId in Maven
- |                       |-- 1.0-SNAPSHOT/ -- this is the version best used to describe the project ("marketed version")
- |                       |   |-- scanner-1.0-20091120.012345-1.pom/ -- filename is a node, each is distinct except for checksums, etc.
- |                       |   |   |-- asc=
- |                       |   |   |-- created=
- |                       |   |   |-- maven:buildNumber=
- |                       |   |   |-- maven:packaging=
- |                       |   |   |-- maven:timestamp=
- |                       |   |   |-- md5=
- |                       |   |   |-- sha1=
- |                       |   |   |-- size=
- |                       |   |   |-- updated=
- |                       |   |   `-- version= -- the actual version of the file, 1.0-20091120.012345-1
- |                       |   |-- created=
- |                       |   |-- description=
- |                       |   |-- name=
- |                       |   |-- organization.name=
- |                       |   |-- organization.url=
- |                       |   `-- updated=
- |                       |-- maven:artifactId=
- |                       `-- maven:groupId=
+ |               `-- platform/
+ |                   |-- scanner/
+ |                   |   |-- 1.0-SNAPSHOT/
+ |                   |   |   |-- scanner-1.0-20091120.012345-1.pom/
+ |                   |   |   |   |-- asc=
+ |                   |   |   |   |-- created=
+ |                   |   |   |   |-- maven:buildNumber=
+ |                   |   |   |   |-- maven:classifier=
+ |                   |   |   |   |-- maven:timestamp=
+ |                   |   |   |   |-- maven:type=
+ |                   |   |   |   |-- md5=
+ |                   |   |   |   |-- sha1=
+ |                   |   |   |   |-- size=
+ |                   |   |   |   |-- updated=
+ |                   |   |   |   `-- version=
+ |                   |   |   |-- ciManagement.system=
+ |                   |   |   |-- ciManagement.url=
+ |                   |   |   |-- created=
+ |                   |   |   |-- dependencies.0.artifactId=
+ |                   |   |   |-- dependencies.0.classifier=
+ |                   |   |   |-- dependencies.0.groupId=
+ |                   |   |   |-- dependencies.0.optional=
+ |                   |   |   |-- dependencies.0.scope=
+ |                   |   |   |-- dependencies.0.systemPath=
+ |                   |   |   |-- dependencies.0.type=
+ |                   |   |   |-- dependencies.0.version=
+ |                   |   |   |-- description=
+ |                   |   |   |-- individuals.0.email=
+ |                   |   |   |-- individuals.0.name=
+ |                   |   |   |-- individuals.0.properties.scmId=
+ |                   |   |   |-- individuals.0.roles.0=
+ |                   |   |   |-- individuals.0.timezone=
+ |                   |   |   |-- issueManagement.system=
+ |                   |   |   |-- issueManagement.url=
+ |                   |   |   |-- licenses.0.name=
+ |                   |   |   |-- licenses.0.url=
+ |                   |   |   |-- mailingLists.0.mainArchiveUrl=
+ |                   |   |   |-- mailingLists.0.name=
+ |                   |   |   |-- mailingLists.0.otherArchives.0=
+ |                   |   |   |-- mailingLists.0.postAddress=
+ |                   |   |   |-- mailingLists.0.subscribeAddress=
+ |                   |   |   |-- mailingLists.0.unsubscribeAddress=
+ |                   |   |   |-- maven:buildExtensions.0.artifactId=
+ |                   |   |   |-- maven:buildExtensions.0.groupId=
+ |                   |   |   |-- maven:buildExtensions.0.version=
+ |                   |   |   |-- maven:packaging=
+ |                   |   |   |-- maven:parent.artifactId=
+ |                   |   |   |-- maven:parent.groupId=
+ |                   |   |   |-- maven:parent.version=
+ |                   |   |   |-- maven:plugins.0.artifactId=
+ |                   |   |   |-- maven:plugins.0.groupId=
+ |                   |   |   |-- maven:plugins.0.reporting=
+ |                   |   |   |-- maven:plugins.0.version=
+ |                   |   |   |-- maven:properties.mavenVersion=
+ |                   |   |   |-- maven:repositories.0.id=
+ |                   |   |   |-- maven:repositories.0.layout=
+ |                   |   |   |-- maven:repositories.0.name=
+ |                   |   |   |-- maven:repositories.0.plugins=
+ |                   |   |   |-- maven:repositories.0.releases=
+ |                   |   |   |-- maven:repositories.0.snapshots=
+ |                   |   |   |-- maven:repositories.0.url=
+ |                   |   |   |-- name=
+ |                   |   |   |-- organization.favicon=
+ |                   |   |   |-- organization.logo=
+ |                   |   |   |-- organization.name=
+ |                   |   |   |-- organization.url=
+ |                   |   |   |-- relocatedTo.namespace=
+ |                   |   |   |-- relocatedTo.project=
+ |                   |   |   |-- relocatedTo.projectVersion=
+ |                   |   |   |-- scm.connection=
+ |                   |   |   |-- scm.developerConnection=
+ |                   |   |   |-- scm.url=
+ |                   |   |   |-- updated=
+ |                   |   |   `-- url=
+ |                   |   `-- maven:artifactId=
+ |                   `-- maven:groupId=
  `-- metadata/
 
 (To update - run "tree --dirstfirst -F" on the unpacked content-model.zip from the sandbox)
 
 Notes:
 
-1) Projects are just a single code project. They do not have subprojects - if such modeling needs to be done, then we can create a products
-tree that will map what "Archiva 1.0" contains from the other repositories.
+*) In the above example, we have the following coordinates:
+   - namespace = org.apache.archiva.platform (namespaces are of arbitrary depth, and are project namespaces, not to be
+     confused with JCR's item/node namespaces)
+   - project = scanner
+   - version = 1.0-SNAPSHOT
+   - artifact = scanner-1.0-20091120.012345-1.pom
 
-2) There is not Maven-native information here, other than that in the maven: namespace. pom & other files are not treated as special - they are
-each stored and it is up to the reader to interpret
+*) filename (scanner-1.0-20091120.012345-1.pom) is a node, and each is distinct except for checksums, etc.
 
-3) artifact data is not stored in the content repository (there is no data= property on the file). The information here is enough to locate the
-file in the original storageUrl when it is requested
+*) the top level version (1.0-SNAPSHOT) is the version best used to describe the project (the "marketed version"). It
+   must still be unique for lookup and comparing project versions to each other, but can contain several different
+   "build" artifacts.
 
-4) The API will still use separate namespace and project identifiers (the namespace can be null if there isn't one). This is chosen to allow
-splitting the namespace on '.', and also allowing '.' in the project identifier without splitting
+*) Projects are just a single code project. They do not have subprojects - if such modeling needs to be done, then we
+   can create a products tree that will map what "Archiva 1.0" contains from the other repositories.
 
-5) properties with '.' may be nested in other representations such as Java models or XML, if appropriate
+*) There is not Maven-native information here, other than that in the maven: namespace. pom & other files are not
+   treated as special - they are each stored and it is up to the reader to interpret
+
+*) artifact data is not stored in the metadata repository (there is no data= property on the file). The information here
+   is enough to locate the file in the original storageUrl when it is requested
+
+*) The API will still use separate namespace and project identifiers (the namespace can be null if there isn't one).
+   This is chosen to allow splitting the namespace on '.', and also allowing '.' in the project identifier without
+   splitting
+
+*) properties with '.' may be nested in other representations such as Java models or XML, if appropriate
+
+*) we only keep one set of project information for a "version" - this differs from Maven's storage of one POM per
+   snapshot. The Maven 2 module will take the latest. Those that need Maven's behaviour should retrieve the POM
+   directly. Implementations are also free to store as much information as desired within the artifact node in addition
+   to whatever is shared in the project version node.
+
+*) while some information is stored at the most generic level in the metadata repository (eg maven:groupId,
+   maven:artifactId), for convenience when loaded by the implementation it may all be pushed into the projectVersion's
+   information. The metadata repository implementation can decide how best to store and retrieve the information.
+
+*) created/updated timestamps may be maintained by the metadata repository implementation
+
+*) references are stored outside the main model so that their creation doesn't imply a "stub" model - we know if the
+   project exists whether a reference is created or not. References need not infer referential integrity
+
+*) some of the above needs to be reviewed before going into production. For example:
+   - the maven specific aspects of dependencies should become a faceted part of the content
+   - more of the metadata might be faceted in general, keeping the content model basic by default
+   - the storing of metadata as 0-indexed lists would be better in as child nodes. This might require additional levels
+     in the current repository (.../scanner/versions/1.0-SNAPSHOT/artifacts/scanner-1.0-20091120.012345-1.pom), or
+     for listed information to be in a separate tree
+     (/metadata/org/apache/archiva/platform/scanner/1.0-SNAPSHOT/mailingLists/users), or to use some 'reserved names'
+     for nodes (by using a content repository's namespacing capabilities). The first has the advantage of
+     keeping information together but a longer path name and less familiarity to Maven users. The second arbitrarily
+     divides metadata. The third option seems preferable but needs more investigation at this stage.
+
+*) Future possibilities:
+   - audit metadata on artifacts (who uploaded, when, and how), or whether it was discovered by scanning
 
-6) we only keep one set of project information for a "version" - this differs from Maven's storage of one POM per snapshot. The Maven 2 module will
-   take the latest. Those that need Maven's behaviour should retrieve the POM directly. Implementations are also free to store as much information
-   as desired within the artifact node in addition to whatever is shared in the project version node.
\ No newline at end of file
-- 
2.39.5
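
Illustration (not part of the patch): the notes in the patched content-model.txt say that the API keeps the namespace and project identifiers separate, that the namespace may be null and is split on '.', and that properties containing '.' may be nested in other representations. The Java sketch below shows one way those two rules could look in code. The class name ContentModelSketch and the methods toPath and nest are hypothetical and are not taken from Archiva; the sketch also assumes that no key is both a leaf value and a prefix of another key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Archiva's API.
class ContentModelSketch {

    // Builds the node path for an artifact. Only the namespace is split on '.',
    // so a project identifier containing '.' stays a single path segment.
    static String toPath(String namespace, String project, String version, String artifact) {
        List<String> segments = new ArrayList<>();
        if (namespace != null && !namespace.isEmpty()) {
            for (String part : namespace.split("\\.")) {
                segments.add(part);
            }
        }
        segments.add(project);
        segments.add(version);
        segments.add(artifact);
        return String.join("/", segments);
    }

    // Nests flat keys such as "organization.name" into maps keyed by each
    // dotted segment. List indexes ("0", "1", ...) are kept as plain map keys.
    // Assumption: no key is both a value and a prefix of another key.
    @SuppressWarnings("unchecked")
    static Map<String, Object> nest(Map<String, String> flat) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : flat.entrySet()) {
            String[] parts = entry.getKey().split("\\.");
            Map<String, Object> node = root;
            for (int i = 0; i < parts.length - 1; i++) {
                node = (Map<String, Object>) node.computeIfAbsent(
                        parts[i], k -> new LinkedHashMap<String, Object>());
            }
            node.put(parts[parts.length - 1], entry.getValue());
        }
        return root;
    }

    public static void main(String[] args) {
        // Coordinates taken from the example tree in content-model.txt.
        System.out.println(toPath("org.apache.archiva.platform", "scanner",
                "1.0-SNAPSHOT", "scanner-1.0-20091120.012345-1.pom"));

        // Property values here are placeholders, not real project data.
        Map<String, String> flat = new LinkedHashMap<>();
        flat.put("organization.name", "example-name");
        flat.put("organization.url", "example-url");
        flat.put("mailingLists.0.name", "example-list");
        System.out.println(nest(flat));
    }
}
```

Keeping the namespace and project as separate arguments is what allows a '.' inside the project identifier to survive unsplit, which a single dotted coordinate string could not guarantee.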