The following is the intended content model for the metadata content repository:
.
`-- repositories/
    `-- central/
        |-- config/
        |   |-- name=
        |   |-- storageUrl=
        |   `-- uri=
        |-- content/
        |   `-- org/
        |       `-- apache/
        |           |-- archiva/
        |           |   `-- platform/
        |           |       |-- scanner/
        |           |       |   |-- 1.0-SNAPSHOT/
        |           |       |   |   |-- scanner-1.0-20091120.012345-1.pom/
        |           |       |   |   |   |-- asc=
        |           |       |   |   |   |-- created=
        |           |       |   |   |   |-- fileCreated=
        |           |       |   |   |   |-- fileLastModified=
        |           |       |   |   |   |-- maven:buildNumber=
        |           |       |   |   |   |-- maven:classifier
        |           |       |   |   |   |-- maven:timestamp=
        |           |       |   |   |   |-- maven:type=
        |           |       |   |   |   |-- md5=
        |           |       |   |   |   |-- sha1=
        |           |       |   |   |   |-- size=
        |           |       |   |   |   |-- updated=
        |           |       |   |   |   `-- version=
        |           |       |   |   |-- ciManagement.system=
        |           |       |   |   |-- ciManagement.url=
        |           |       |   |   |-- created=
        |           |       |   |   |-- dependencies.0.artifactId=
        |           |       |   |   |-- dependencies.0.classifier=
        |           |       |   |   |-- dependencies.0.groupId=
        |           |       |   |   |-- dependencies.0.optional=
        |           |       |   |   |-- dependencies.0.scope=
        |           |       |   |   |-- dependencies.0.systemPath=
        |           |       |   |   |-- dependencies.0.type=
        |           |       |   |   |-- dependencies.0.version=
        |           |       |   |   |-- description=
        |           |       |   |   |-- individuals.0.email=
        |           |       |   |   |-- individuals.0.name=
        |           |       |   |   |-- individuals.0.properties.scmId=
        |           |       |   |   |-- individuals.0.roles.0=
        |           |       |   |   |-- individuals.0.timezone=
        |           |       |   |   |-- issueManagement.system=
        |           |       |   |   |-- issueManagement.url=
        |           |       |   |   |-- licenses.0.name=
        |           |       |   |   |-- licenses.0.url=
        |           |       |   |   |-- mailingLists.0.mainArchiveUrl=
        |           |       |   |   |-- mailingLists.0.name=
        |           |       |   |   |-- mailingLists.0.otherArchives.0=
        |           |       |   |   |-- mailingLists.0.postAddress=
        |           |       |   |   |-- mailingLists.0.subscribeAddress=
        |           |       |   |   |-- mailingLists.0.unsubscribeAddress=
        |           |       |   |   |-- maven:buildExtensions.0.artifactId=
        |           |       |   |   |-- maven:buildExtensions.0.groupId=
        |           |       |   |   |-- maven:buildExtensions.0.version=
        |           |       |   |   |-- maven:packaging=
        |           |       |   |   |-- maven:parent.artifactId=
        |           |       |   |   |-- maven:parent.groupId=
        |           |       |   |   |-- maven:parent.version=
        |           |       |   |   |-- maven:plugins.0.artifactId=
        |           |       |   |   |-- maven:plugins.0.groupId=
        |           |       |   |   |-- maven:plugins.0.reporting=
        |           |       |   |   |-- maven:plugins.0.version=
        |           |       |   |   |-- maven:properties.mavenVersion=
        |           |       |   |   |-- maven:repositories.0.id=
        |           |       |   |   |-- maven:repositories.0.layout=
        |           |       |   |   |-- maven:repositories.0.name=
        |           |       |   |   |-- maven:repositories.0.plugins=
        |           |       |   |   |-- maven:repositories.0.releases=
        |           |       |   |   |-- maven:repositories.0.snapshots=
        |           |       |   |   |-- maven:repositories.0.url=
        |           |       |   |   |-- name=
        |           |       |   |   |-- organization.favicon=
        |           |       |   |   |-- organization.logo=
        |           |       |   |   |-- organization.name=
        |           |       |   |   |-- organization.url=
        |           |       |   |   |-- relocatedTo.namespace=
        |           |       |   |   |-- relocatedTo.project=
        |           |       |   |   |-- relocatedTo.projectVersion=
        |           |       |   |   |-- scm.connection=
        |           |       |   |   |-- scm.developerConnection=
        |           |       |   |   |-- scm.url=
        |           |       |   |   |-- updated=
        |           |       |   |   `-- url=
        |           |       |   `-- maven:artifactId=
        |           |       `-- maven:groupId=
        |           `-- maven/
        |               `-- plugins/
        |                   |-- maven:groupId=
        |                   |-- maven:plugins.compiler.artifactId=
        |                   `-- maven:plugins.compiler.name=
        |-- facets/
        |   |-- org.apache.archiva.audit/
        |   |   `-- 2010/
        |   |       `-- 01/
        |   |           `-- 19/
        |   |               `-- 093600.000/
        |   |                   |-- action=
        |   |                   |-- artifact.id=
        |   |                   |-- artifact.namespace=
        |   |                   |-- artifact.projectId=
        |   |                   |-- artifact.version=
        |   |                   |-- remoteIP=
        |   |                   `-- user=
        |   |-- org.apache.archiva.metadata.repository.stats/
        |   |   `-- 2009/
        |   |       `-- 12/
        |   |           `-- 03/
        |   |               `-- 090000.000/
        |   |                   |-- scanEndTime=
        |   |                   |-- scanStartTime=
        |   |                   |-- totalArtifactCount=
        |   |                   |-- totalArtifactFileSize=
        |   |                   |-- totalFileCount=
        |   |                   |-- totalGroupCount=
        |   |                   `-- totalProjectCount=
        |   `-- org.apache.archiva.reports/
        `-- references/
            `-- org/
                `-- apache/
                    `-- archiva/
                        |-- parent/
                        |   `-- 1/
                        |       `-- references/
                        |           `-- org/
                        |               `-- apache/
                        |                   `-- archiva/
                        |                       |-- platform/
                        |                       |   `-- scanner/
                        |                       |       `-- 1.0-SNAPSHOT/
                        |                       |           `-- referenceType=parent
                        |                       `-- web/
                        |                           `-- webapp/
                        |                               `-- 1.0-SNAPSHOT/
                        |                                   `-- referenceType=parent
                        `-- platform/
                            `-- scanner/
                                `-- 1.0-SNAPSHOT/
                                    `-- references/
                                        `-- org/
                                            `-- apache/
                                                `-- archiva/
                                                    `-- web/
                                                        `-- webapp/
                                                            `-- 1.0-SNAPSHOT/
                                                                `-- referenceType=dependency
(To update - run "tree --dirsfirst -F" on the unpacked content-model.zip from the sandbox)
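As a rough illustration of how the content/ part of the layout above is addressed, the sketch below builds a node path
from a repository id, namespace, project, version and artifact. The function name content_path and the use of '/' as
a separator are assumptions made for this example only; they are not part of any Archiva API. The namespace is split on
'.', while the project identifier is left intact so that it may itself contain '.'.

```python
def content_path(repo_id, namespace, project, version=None, artifact=None):
    """Build a node path under repositories/<repo_id>/content (hypothetical helper)."""
    parts = ["repositories", repo_id, "content"]
    # The namespace becomes one node per '.'-separated segment; it may be None.
    if namespace:
        parts.extend(namespace.split("."))
    # The project identifier is never split, so it may contain '.'.
    parts.append(project)
    if version is not None:
        parts.append(version)
        # An artifact only makes sense below a project version.
        if artifact is not None:
            parts.append(artifact)
    return "/".join(parts)


path = content_path(
    "central",
    "org.apache.archiva.platform",
    "scanner",
    "1.0-SNAPSHOT",
    "scanner-1.0-20091120.012345-1.pom",
)
print(path)
# repositories/central/content/org/apache/archiva/platform/scanner/1.0-SNAPSHOT/scanner-1.0-20091120.012345-1.pom
```

Keeping the namespace and project as separate arguments is what makes the split unambiguous: a single dotted string
could not tell where the namespace ends and the project begins.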
Notes:
*) config should live in an external configuration file; it is stored in the content repository only so that it can be
accessed by other means, for example through a REST API
*) In the above example, we have the following coordinates:
- namespace = org.apache.archiva.platform (namespaces are of arbitrary depth, and are project namespaces, not to be
confused with JCR's item/node namespaces)
- project = scanner
- version = 1.0-SNAPSHOT
- artifact = scanner-1.0-20091120.012345-1.pom
*) each filename (scanner-1.0-20091120.012345-1.pom) is a distinct node; checksums and similar companion data (md5,
sha1, asc) are stored as properties of that node rather than as nodes of their own
*) the top level version (1.0-SNAPSHOT) is the version best used to describe the project (the "marketed version"). It
must still be unique for lookup and comparing project versions to each other, but can contain several different
"build" artifacts.
*) Projects are just a single code project. They do not have subprojects - if such modeling needs to be done, then we
can create a products tree that will map what "Archiva 1.0" contains from the other repositories.
*) There is no Maven-native information here, other than that in the maven: namespace. POMs and other files are not
treated as special - each is stored, and it is up to the reader to interpret it
*) artifact data is not stored in the metadata repository (there is no data= property on the file). The information here
is enough to locate the file in the original storageUrl when it is requested
*) The API will still use separate namespace and project identifiers (the namespace can be null if there isn't one).
This is chosen to allow splitting the namespace on '.', and also allowing '.' in the project identifier without
splitting
*) properties with '.' may be nested in other representations such as Java models or XML, if appropriate
*) we only keep one set of project information for a "version" - this differs from Maven's storage of one POM per
snapshot. The Maven 2 module will take the latest. Those that need Maven's behaviour should retrieve the POM
directly. Implementations are also free to store as much information as desired within the artifact node in addition
to whatever is shared in the project version node.
*) while some information is stored at the most generic level in the metadata repository (eg maven:groupId,
maven:artifactId), for convenience when loaded by the implementation it may all be pushed into the projectVersion's
information. The metadata repository implementation can decide how best to store and retrieve the information.
*) created/updated timestamps may be maintained by the metadata repository implementation for the metadata itself.
Timestamps for individual files are stored as additional properties (fileCreated, fileLastModified). It may make
sense to add a "discovered" timestamp if an artifact is known to have been created at a different time from when it
was added to the metadata repository.
*) references are stored outside the main model so that their creation doesn't imply a "stub" model - we know whether
the project exists regardless of whether a reference has been created. References need not imply referential
integrity.
*) some of the above needs to be reviewed before going into production. For example:
- the maven specific aspects of dependencies should become a faceted part of the content
- more of the metadata might be faceted in general, keeping the content model basic by default
- determine if any of the stats can be derived by functions of the content repository rather than storing and trying
to keep them up to date. Historical data might be retained by versioning and taking a snapshot at a given point in
time. The current approach of tying them to the scanning process is not optimal
- the metadata stored as 0-indexed lists would be better stored as child nodes. This might require additional levels
in the current repository (.../scanner/versions/1.0-SNAPSHOT/artifacts/scanner-1.0-20091120.012345-1.pom), or
for listed information to be in a separate tree
(/metadata/org/apache/archiva/platform/scanner/1.0-SNAPSHOT/mailingLists/users), or to use some 'reserved names'
for nodes (by using a content repository's namespacing capabilities). The first has the advantage of
keeping information together but a longer path name and less familiarity to Maven users. The second arbitrarily
divides metadata. The third option seems preferable but needs more investigation at this stage.
*) Future possibilities:
- audit metadata on artifacts (who uploaded, when, and how), or whether it was discovered by scanning
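
The note above about nesting properties that contain '.' can be sketched as follows. This is only an illustration of
one possible mapping, not how any Archiva implementation does it: the helper names nest_properties, _listify and
_is_index are invented for the example, as are the sample values. Dotted keys become nested dictionaries, and any level
whose keys are exactly 0..n-1 becomes a list. A key that is both a leaf and a prefix of another key (for example "scm"
alongside "scm.url") is not handled.

```python
def _is_index(key):
    """True for canonical non-negative integers such as '0' or '12' (not '01')."""
    return key.isascii() and key.isdigit() and str(int(key)) == key


def _listify(node):
    """Recursively convert dicts whose keys are exactly 0..n-1 into lists."""
    if not isinstance(node, dict):
        return node
    converted = {k: _listify(v) for k, v in node.items()}
    if converted and all(_is_index(k) for k in converted):
        indices = sorted(int(k) for k in converted)
        if indices == list(range(len(indices))):
            return [converted[str(i)] for i in indices]
    return converted


def nest_properties(flat):
    """Turn flat dotted property names into nested dicts and lists."""
    root = {}
    for key, value in flat.items():
        parts = key.split(".")
        node = root
        # Walk (and create) the intermediate levels, then set the leaf.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return _listify(root)


# Sample values are made up for illustration.
flat = {
    "name": "Example Scanner",
    "mailingLists.0.name": "users",
    "mailingLists.0.otherArchives.0": "http://example.invalid/archive",
    "maven:parent.artifactId": "parent",
}
print(nest_properties(flat))
```

Note that a prefix such as maven: stays attached to the first segment of the key, so "maven:parent.artifactId" nests
under a "maven:parent" entry.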