seth vidal
2010-08-04 14:59:58 UTC
Hi folks,
instead of trundling down the existing thread I thought we could
[re]start a discussion of things that could be improved in future
versions of the repodata and in how the data is laid out.
there have been a number of times where I wished we could have arranged
some of the metadata differently, and there is a lot of need to
optimize what has to be downloaded. I've been thinking about it and
discussing it casually with various folks for quite a while but have not
acted on anything.
there was a discussion in June on yum-devel:
http://lists.baseurl.org/pipermail/yum-devel/2010-June/007123.html
Let's be clear, though: given how repodata/repomd.xml is laid out, there
is no requirement to break backward compat. We can add datatypes and
move forward that way, so there's no reason for drama or gnashing of
teeth. The only trick will be whether all repos support all versions of
the metadata, either by their own choice or by requirement.
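To make the backward-compat point concrete, here is a rough sketch of how a client that only knows today's datatypes could skip over a new one in repomd.xml. It is illustrative only, not yum's actual code: the "primary_chunks" type, the file names, and the split_data_types helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml
REPO_NS = "{http://linux.duke.edu/metadata/repo}"

# Datatypes this (hypothetical) older client understands.
KNOWN_TYPES = {"primary", "filelists", "other"}

# Trimmed-down repomd.xml. "primary_chunks" is an invented new datatype,
# and the hrefs are placeholders; real entries also carry checksums,
# timestamps and sizes.
SAMPLE_REPOMD = """\
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="filelists">
    <location href="repodata/filelists.xml.gz"/>
  </data>
  <data type="primary_chunks">
    <location href="repodata/primary_chunks.xml.gz"/>
  </data>
</repomd>
"""


def split_data_types(repomd_text, known=KNOWN_TYPES):
    """Return (usable, ignored) dicts mapping datatype -> location href."""
    root = ET.fromstring(repomd_text)
    usable, ignored = {}, {}
    for data in root.findall(REPO_NS + "data"):
        dtype = data.get("type")
        location = data.find(REPO_NS + "location")
        href = location.get("href") if location is not None else None
        # Unknown datatypes are set aside rather than treated as errors,
        # which is what lets a repo add new ones without breaking old clients.
        if dtype in known:
            usable[dtype] = href
        else:
            ignored[dtype] = href
    return usable, ignored


usable, ignored = split_data_types(SAMPLE_REPOMD)
print("usable: ", sorted(usable))
print("ignored:", sorted(ignored))
```

An older client carries on with the types it knows, and a newer one can look for the extra type first and fall back to the full files when it is missing.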
The general idea has been to:
- make the metadata more 'chunkable' so you can retrieve smaller pieces
of it and only retrieve more complete sets when you need them or when
there is no disadvantage to getting everything in one blob.
- make it more trivial to search
- keep it as verifiable as it is now
- make it possible to know when a repo has 'expired', if ever
- not break everyone in the process
- make it easier to provide translatable chunks for the
summary/description fields of pkgs
- make it more obvious where/how external metadata (external to the rpms
themselves) can be added to a repository.
The reasons for these changes are fairly simple: with repos growing in
size it is becoming less and less manageable to download huge xml or
sqlite files to update your local cache.
-sv