Post by James Antill
Post by Anders F Björklund
Post by James Antill
Post by Anders F Björklund
openSUSE actually uses LZMA, not XZ. Both for RPMs and for repodata.
Using liblzma would handle both, but feeding xz to lzma doesn't work.
Does that matter? The rpm payload is inside the rpm, so it matters
less, and can't be changed now anyway. But for "normal" files the
convention is to use .xz now ... no?
I thought the "convention" was .gz, since that's the only choice given.
For repomd, yes, .gz/.bz2 is the only std. target atm. I meant that as a
general answer it's common to use .xz but not .lzma, e.g.
ftp://ftp.gnu.org/gnu/coreutils/ now has .gz and .xz for the latest
releases.
Sure, but it said "in openSUSE" above, and not "in general". :-)
It doesn't matter, as when I said "LZMA" I meant either lzma/xz,
as in either of the (legacy) LZMA_Alone or the XZ file formats.
unxz and pyliblzma handle both (xzdec only handled .xz - my bad)
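To illustrate the point about one decoder handling both container formats, here is a minimal sketch. It assumes Python's standard `lzma` module (a liblzma binding) rather than the pyliblzma package mentioned above, and the payload is made up for the example.

```python
import lzma

# Made-up payload standing in for a repodata file.
payload = b"<metadata packages='0'/>\n" * 100

# The same bytes written in the two container formats discussed above.
as_xz = lzma.compress(payload, format=lzma.FORMAT_XZ)       # .xz
as_lzma = lzma.compress(payload, format=lzma.FORMAT_ALONE)  # legacy .lzma (LZMA_Alone)


def decompress_either(data: bytes) -> bytes:
    """Decode .xz or .lzma input; FORMAT_AUTO detects the container."""
    return lzma.decompress(data, format=lzma.FORMAT_AUTO)


def decompress_xz_only(data: bytes) -> bytes:
    """Decode .xz input only, similar in spirit to an xz-only decoder."""
    return lzma.decompress(data, format=lzma.FORMAT_XZ)


print(decompress_either(as_xz) == payload)
print(decompress_either(as_lzma) == payload)

try:
    decompress_xz_only(as_lzma)
except lzma.LZMAError as exc:
    # A strict .xz decoder rejects the legacy container.
    print("xz-only decoder refused .lzma input:", exc)
```

The auto-detecting path is what lets a client accept either file without caring which one the repository shipped.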
Post by James Antill
Post by Anders F Björklund
Either way it's not as much of a gain as for yum which can use the
.sqlite files directly. Both the .xml and .sqlite need converting,
into the internal format. And so far changing hasn't been worth it.
Yes, we understand yum gets a bigger speedup than zypper will because
we use the .sqlite directly. I'm even willing to concede that a custom
DB can be faster than sqlite (although I doubt it is _significant_),
for the data it is designed for.
I don't know much about zypper, so will let duncanmv answer that.
Was talking about Smart, which reads everything into the "cache".
Post by James Antill
However it seems like a very worthwhile goal, to me, to only have one
set of MD. And for that set of MD to not require worthless conversions
on each client. There is currently only primary_db in upstream
createrepo, which meets those needs.
I thought that the .xml was transparently converted to .sqlite
on the client with the use of the "yum-metadata-parser" module ?
Having the .sqlite in the repodata is more like a pre-compute,
especially if you are not going to use the database afterwards.
Post by James Antill
So, while we aren't going to remove "primary" generation support
tomorrow, it is very much a second class type already IMO.
Post by Anders F Björklund
Post by James Antill
.sqlite is about as extensible as XML, and primary/etc. have never
changed (and the coming changes are just as likely to be done by adding
new files).
It's easier to add new attributes and tags, than new columns and tables.
I would disagree, you can do a single call in sqlite to see if a table
or column exists and if so it will be there for every value ... XML is
much less conforming. At worst I'd say it's the same.
OK. I'll try to add the "Requires(hint):" to the sqlite as well.
It should only be an extra column of "hint BOOLEAN DEFAULT FALSE"
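A minimal sketch of both points above: checking for a column with a single query, and adding the "hint" column. It uses Python's `sqlite3` module; the `requires` table here is a simplified, hypothetical stand-in and not the real primary_db schema, and `DEFAULT 0` is used in place of `DEFAULT FALSE` on the assumption that older SQLite builds may not know the FALSE keyword.

```python
import sqlite3


def has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` has a column named `column`."""
    # PRAGMA table_info yields one row per column; field 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table for illustration only.
conn.execute("CREATE TABLE requires (name TEXT, flags TEXT, pkgKey INTEGER)")
conn.execute("INSERT INTO requires VALUES ('libfoo', NULL, 1)")

if not has_column(conn, "requires", "hint"):
    # Existing rows pick up the default, so old data stays valid.
    conn.execute("ALTER TABLE requires ADD COLUMN hint BOOLEAN DEFAULT 0")

row = conn.execute("SELECT name, hint FROM requires").fetchone()
print(row)
```

A reader that does not know about the new column simply never selects it, which is the sense in which the two formats are comparably extensible.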
Post by James Antill
Post by Anders F Björklund
Post by James Antill
However there is a cost to having 4-8 different versions of
primary/filelists/etc. ... both in createrepo time, in hosting disk
space, in maintenance of all the weird code paths and in repomd.xml size
(most repos. aren't using metalink, so repomd.xml is downloaded a lot).
I'm not sure where this 4-8 number came from. We were talking about 2,
or 3 if you want to include the .sqlite files too (which are different).
repomd.xml
primary.xml.gz (or primary.xml)
primary.xml.xz (or primary.xml.lzma)
primary.sqlite.bz2
primary.gz
primary.lzma
primary.xz
primary_db.bz2
primary_db.xz
primary_solv.xz
[...]
Everything ? No, the patch was to add just "primary_lzma".
(with the addition of "filelists_lzma" and "other_lzma")
There is no need to have both of .lzma and .xz, and the
.sqlite and .solv are mostly useful for yum and zypper...
The only other addition I made was to add an ".index"
file, so that one could seek a specific pkgid quickly.
(it's just a text file with "$pkgid\t$offset\n" lines, and the offsets
point into the uncompressed stream, so only one index is needed)
But that index file is also easy to compute afterwards.
Sample program was like 50 lines of python or something.
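For illustration, a rough sketch of how such an index could be computed afterwards. This is not the sample program referred to above; the function names are hypothetical, and it assumes each `<package>` element carries a `<checksum ... pkgid="YES">` child as in a typical primary.xml.

```python
import re

PACKAGE_START = re.compile(rb"<package\b")
PKGID = re.compile(rb'<checksum[^>]*pkgid="YES"[^>]*>([0-9a-f]+)</checksum>')


def build_index(xml_bytes: bytes) -> str:
    """Return "$pkgid\\t$offset\\n" lines for the uncompressed XML stream."""
    starts = [m.start() for m in PACKAGE_START.finditer(xml_bytes)]
    lines = []
    for i, start in enumerate(starts):
        # Only search inside this package element's span.
        end = starts[i + 1] if i + 1 < len(starts) else len(xml_bytes)
        found = PKGID.search(xml_bytes, start, end)
        if found:
            lines.append(f"{found.group(1).decode()}\t{start}\n")
    return "".join(lines)


def lookup(index_text: str, pkgid: str):
    """Return the byte offset recorded for `pkgid`, or None if absent."""
    for line in index_text.splitlines():
        key, offset = line.split("\t")
        if key == pkgid:
            return int(offset)
    return None


# Tiny made-up document, just enough to exercise the functions.
sample = (
    b'<metadata packages="2">\n'
    b'<package type="rpm"><name>a</name>'
    b'<checksum type="sha" pkgid="YES">aa11</checksum></package>\n'
    b'<package type="rpm"><name>b</name>'
    b'<checksum type="sha" pkgid="YES">bb22</checksum></package>\n'
    b"</metadata>\n"
)

index = build_index(sample)
print(index, end="")
```

Because the offsets refer to the uncompressed stream, the same index applies whichever compressed variant of the file a client downloaded.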
So for a generic repo there would only be *two* files,
the "compat" .xml.gz and either of .xml.xz / .sqlite.bz2
primary.xml.gz
primary.xml.xz
Post by James Antill
mini_primary.xz
[...]
...but, obviously that is "in the future", as no code has been written
for any of the proposed repodata formats.
Sounds like a totally different discussion, as per thread ?
The "primary_lzma" type addition was definitely here-and-now.
--anders