Discussion:
[xstream-user] XStream keeps XPath refs to ints, doubles, other Immutables
Geoff Groos
2015-04-21 19:53:07 UTC
Permalink
Hey guys,

Deserialization of our model (using XStream) currently requires upwards of 500mb of memory. I’ve been tasked with fixing this since, in the worst case, our model is a matrix with an upper bound of 10,000 rows and 1,000 columns (10mb big).

Looking at our deserialization process under the profiler, I noticed that XStream’s xpath map was consuming upwards of 95% of that space. Digging in a little deeper, I noticed the XPath map keeping paths to Booleans and Integers.

What’s the purpose of keeping an XPath entry to an instance of an immutable type?

In other words, If I had

class XPathAnnoyer{
public Integer x; //using Object Integer, an immutable type.
public Integer y;
}

Integer intInstance = new Integer(20);
XPathAnnoyer instance = new XPathAnnoyer();
instance.x = intInstance;
instance.y = intInstance;

String result = xstream.serialize(instance);

Result would be
<XPathAnnoyer>
<x>20</x>
<y>20</y>
</XPathAnnoyer>

and the xpath map will contain the (completely useless entry)
“***@ABC123” -> “root/dotdotdot/XPathAnnoyer/x”

So when would that entry ever get used? Why bother keeping it?

My argument against keeping it is that I’m hoping I can get my memory profile down significantly.

Let me know if my understanding of this issue is not correct, or if there’s some obvious case I’m overlooking.

thanks for any help,

-Geoff
Jörg Schaible
2015-04-21 21:56:48 UTC
Permalink
Hi Geoff,
Post by Geoff Groos
Hey guys,
Deserialization of our model (using XStream) currently requires upwards of
500mb of memory. I’ve been tasked with fixing this since, in the worst
case, our model is a matrix with an upper bound of 10,000 rows and 1,000
columns (10mb big).
Looking at our deserialization process under the profiler, I noticed that
XStream’s xpath map was consuming upwards of 95% of that space. Digging in
a little deeper, I noticed the XPath map keeping paths to Booleans and
Integers.
What’s the purpose of keeping an XPath entry to an instance of an
immutable type?
In other words, If I had
class XPathAnnoyer{
public Integer x; //using Object Integer, an immutable type.
public Integer y;
}
Integer intInstance = new Integer(20);
XPathAnnoyer instance = new XPathAnnoyer();
instance.x = intInstance;
instance.y = intInstance;
String result = xstream.serialize(instance);
Result would be
<XPathAnnoyer>
<x>20</x>
<y>20</y>
</XPathAnnoyer>
and the xpath map will contain the (completely useless entry)
So when would that entry ever get used? Why bother keeping it?
My argument against keeping it is that I’m hoping I can get my memory
profile down significantly.
Let me know if my understanding of this issue is not correct, or if
there’s some obvious case I’m overlooking.
Which xpath map you're talking about and which XStream version? XStream
keeps the references in the AbstractReferenceMarshaller.references and that
collection does not contain references to immutable types.

Cheers,
Jörg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Geoff Groos
2015-04-21 22:34:49 UTC
Permalink
I'm on XStream 1.4.7 (I'll get myself current in a moment)

I'm looking at line 67 of the AbstractReferenceUnmarshaller where it's putting the integer `10` in as a value for the reference-key ".../myModelMatrixClass/widthOfMatrix". The field corresponding to that path is an 'int'.

the myModelMatrix type does use a custom converter which extends the ReflectionConverter, but it uses the stock `unmarshalField` method to unmarshall that field, which in turn is hitting the 'abstractReferenceUnmarshaller' to unmarshall the '10' value at the element <widthOfMatrix>10</widthOfMatrix>

I hope I'm describing this accurately. I could start putting together an SSCCE if it would help.

-Geoff

From: J?rg Schaible<mailto:***@gmx.de>
Sent: ?Tuesday?, ?April? ?21?, ?2015 ?2?:?57? ?PM
To: ***@xstream.codehaus.org<mailto:***@xstream.codehaus.org>

Hi Geoff,
Post by Geoff Groos
Hey guys,
Deserialization of our model (using XStream) currently requires upwards of
500mb of memory. I've been tasked with fixing this since, in the worst
case, our model is a matrix with an upper bound of 10,000 rows and 1,000
columns (10mb big).
Looking at our deserialization process under the profiler, I noticed that
XStream's xpath map was consuming upwards of 95% of that space. Digging in
a little deeper, I noticed the XPath map keeping paths to Booleans and
Integers.
What's the purpose of keeping an XPath entry to an instance of an
immutable type?
In other words, If I had
class XPathAnnoyer{
public Integer x; //using Object Integer, an immutable type.
public Integer y;
}
Integer intInstance = new Integer(20);
XPathAnnoyer instance = new XPathAnnoyer();
instance.x = intInstance;
instance.y = intInstance;
String result = xstream.serialize(instance);
Result would be
<XPathAnnoyer>
<x>20</x>
<y>20</y>
</XPathAnnoyer>
and the xpath map will contain the (completely useless entry)
So when would that entry ever get used? Why bother keeping it?
My argument against keeping it is that I'm hoping I can get my memory
profile down significantly.
Let me know if my understanding of this issue is not correct, or if
there's some obvious case I'm overlooking.
Which xpath map you're talking about and which XStream version? XStream
keeps the references in the AbstractReferenceMarshaller.references and that
collection does not contain references to immutable types.

Cheers,
J?rg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Jörg Schaible
2015-04-22 00:34:40 UTC
Permalink
Hi Geoff,
Post by Geoff Groos
I'm on XStream 1.4.7 (I'll get myself current in a moment)
I'm looking at line 67 of the AbstractReferenceUnmarshaller where it's
putting the integer `10` in as a value for the reference-key
".../myModelMatrixClass/widthOfMatrix". The field corresponding to that
path is an 'int'.
the myModelMatrix type does use a custom converter which extends the
ReflectionConverter, but it uses the stock `unmarshalField` method to
unmarshall that field, which in turn is hitting the
'abstractReferenceUnmarshaller' to unmarshall the '10' value at the
element <widthOfMatrix>10</widthOfMatrix>
I hope I'm describing this accurately. I could start putting together an
SSCCE if it would help.
OK. My bad, I've looked at the marshaller only. Basically you're right, but
if we throw away the references for immutables, we might introduce a
compatibility problem:

==================== %< =========================
UUID[] uuidArray = new UUID[2];
uuidArray[0] = uuidArray[1] = UUID.randomUUID();
XStream xstream = new XStream();
String xml = xstream.toXML(uuidArray);

// unmarshal in later version of application with a changed setup:
XStream xstream = new XStream();
xstrea.addImmutable(UUID.class);
xstream.fromXML(xml); // XStreamException, reference not found
==================== %< =========================

See, I've already set UUID as immutable for 1.4.9. If we drop also the
references to any immutable, anybody with an XML containing a reference to a
UUID will fail at deserialization. :-/

Cheers,
Jörg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Geoff Groos
2015-04-22 03:41:50 UTC
Permalink
Thanks for the quick replies J?rg,

with my software being at version 0.9, with no customers yet, I'm all for compatibility breaking ?. Especially since this is an issue blocking our release.?

That said I suspect the vast majority of your users wouldn't be.

Perhaps a nice middle ground is to simply never include xpaths for primitives, which I suspect you've been serializing plainly for quite some time? Alternatively, make it user configurable? Our XStream configuration is already fairly extensive, I don't see much harm in making it a little bit more extensive, but I'm sure no library author likes to hear that.

void addImmutableType(Class clazz) { addImmutableType(clazz, XPathPolicy.BACKWARDS_COMPATIBLE); }
void addImmutableType(Class clazz, XPathPolicy pathPolicy){ ... }

enum XPathPolicy{ ALWAYS_RETAIN_PATH, BACKWARDS_COMPATIBLE, NEVER_RETAIN_PATH }

with `BACKWARDS_COMPATIBLE` meaning don't keep the map during serialization, but do keep the map on deserialization.

I've also only thrown the ALWAYS_RETAIN_PATH option up there for completeness, I have no idea why you would want it. I suppose if somebody's doing something particularly complex in a custom converter, they could conceivably use the immutable type concept for purpose known only to them, their domain, and their converter, and involve this 'immutable type but always use paths' mode.

I'm going to write this up now.

Thanks for the help and the great library!

-Geoff

From: J?rg Schaible<mailto:***@gmx.de>
Sent: ?Tuesday?, ?April? ?21?, ?2015 ?5?:?35? ?PM
To: ***@xstream.codehaus.org<mailto:***@xstream.codehaus.org>

Hi Geoff,
Post by Geoff Groos
I'm on XStream 1.4.7 (I'll get myself current in a moment)
I'm looking at line 67 of the AbstractReferenceUnmarshaller where it's
putting the integer `10` in as a value for the reference-key
".../myModelMatrixClass/widthOfMatrix". The field corresponding to that
path is an 'int'.
the myModelMatrix type does use a custom converter which extends the
ReflectionConverter, but it uses the stock `unmarshalField` method to
unmarshall that field, which in turn is hitting the
'abstractReferenceUnmarshaller' to unmarshall the '10' value at the
element <widthOfMatrix>10</widthOfMatrix>
I hope I'm describing this accurately. I could start putting together an
SSCCE if it would help.
OK. My bad, I've looked at the marshaller only. Basically you're right, but
if we throw away the references for immutables, we might introduce a
compatibility problem:

==================== %< =========================
UUID[] uuidArray = new UUID[2];
uuidArray[0] = uuidArray[1] = UUID.randomUUID();
XStream xstream = new XStream();
String xml = xstream.toXML(uuidArray);

// unmarshal in later version of application with a changed setup:
XStream xstream = new XStream();
xstrea.addImmutable(UUID.class);
xstream.fromXML(xml); // XStreamException, reference not found
==================== %< =========================

See, I've already set UUID as immutable for 1.4.9. If we drop also the
references to any immutable, anybody with an XML containing a reference to a
UUID will fail at deserialization. :-/

Cheers,
J?rg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Geoff Groos
2015-04-22 22:33:14 UTC
Permalink
Hey guys,

So I whipped together this quick patch: https://github.com/Groostav/XStream-GG/pull/1

I think I'd like to modify it to remove the `Context` object --its too big a klooge to have the configuration need to know if the caller is Unmarshalling or Marshalling. I'll get around to that today.

let me know what you think.

-Geoff

From: Geoff Groos<mailto:***@empowerops.com>
Sent: ?Tuesday?, ?April? ?21?, ?2015 ?8?:?42? ?PM
To: ***@xstream.codehaus.org<mailto:***@xstream.codehaus.org>

Thanks for the quick replies J?rg,

with my software being at version 0.9, with no customers yet, I'm all for compatibility breaking ?. Especially since this is an issue blocking our release.?

That said I suspect the vast majority of your users wouldn't be.

Perhaps a nice middle ground is to simply never include xpaths for primitives, which I suspect you've been serializing plainly for quite some time? Alternatively, make it user configurable? Our XStream configuration is already fairly extensive, I don't see much harm in making it a little bit more extensive, but I'm sure no library author likes to hear that.

void addImmutableType(Class clazz) { addImmutableType(clazz, XPathPolicy.BACKWARDS_COMPATIBLE); }
void addImmutableType(Class clazz, XPathPolicy pathPolicy){ ... }

enum XPathPolicy{ ALWAYS_RETAIN_PATH, BACKWARDS_COMPATIBLE, NEVER_RETAIN_PATH }

with `BACKWARDS_COMPATIBLE` meaning don't keep the map during serialization, but do keep the map on deserialization.

I've also only thrown the ALWAYS_RETAIN_PATH option up there for completeness, I have no idea why you would want it. I suppose if somebody's doing something particularly complex in a custom converter, they could conceivably use the immutable type concept for purpose known only to them, their domain, and their converter, and involve this 'immutable type but always use paths' mode.

I'm going to write this up now.

Thanks for the help and the great library!

-Geoff

From: J?rg Schaible<mailto:***@gmx.de>
Sent: ?Tuesday?, ?April? ?21?, ?2015 ?5?:?35? ?PM
To: ***@xstream.codehaus.org<mailto:***@xstream.codehaus.org>

Hi Geoff,
Post by Geoff Groos
I'm on XStream 1.4.7 (I'll get myself current in a moment)
I'm looking at line 67 of the AbstractReferenceUnmarshaller where it's
putting the integer `10` in as a value for the reference-key
".../myModelMatrixClass/widthOfMatrix". The field corresponding to that
path is an 'int'.
the myModelMatrix type does use a custom converter which extends the
ReflectionConverter, but it uses the stock `unmarshalField` method to
unmarshall that field, which in turn is hitting the
'abstractReferenceUnmarshaller' to unmarshall the '10' value at the
element <widthOfMatrix>10</widthOfMatrix>
I hope I'm describing this accurately. I could start putting together an
SSCCE if it would help.
OK. My bad, I've looked at the marshaller only. Basically you're right, but
if we throw away the references for immutables, we might introduce a
compatibility problem:

==================== %< =========================
UUID[] uuidArray = new UUID[2];
uuidArray[0] = uuidArray[1] = UUID.randomUUID();
XStream xstream = new XStream();
String xml = xstream.toXML(uuidArray);

// unmarshal in later version of application with a changed setup:
XStream xstream = new XStream();
xstrea.addImmutable(UUID.class);
xstream.fromXML(xml); // XStreamException, reference not found
==================== %< =========================

See, I've already set UUID as immutable for 1.4.9. If we drop also the
references to any immutable, anybody with an XML containing a reference to a
UUID will fail at deserialization. :-/

Cheers,
J?rg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Jörg Schaible
2015-04-27 11:46:02 UTC
Permalink
Hi Geoff,

sorry for being silent, but we're busy getting several projects out of the
Haus before it shuts down.
Post by Geoff Groos
Hey guys,
https://github.com/Groostav/XStream-GG/pull/1
I think I'd like to modify it to remove the `Context` object --its too big
a klooge to have the configuration need to know if the caller is
Unmarshalling or Marshalling. I'll get around to that today.
let me know what you think.
I had a short look at the diffs. Actually they should not contain any cruft
(.class files, ...).

For the implementation: I would have used the Mapper interface also as
"entry" point. However, you may not modify a method, you have to add a new
one instead (resp. overload a method) to keep binary compatibility. The
master branch is already for XStream 1.5.x with a minimum requirement of
Java 6.

However, I assume, you're interested in getting this into the 1.4.x branch
(Java 1.4.x compatible). Therefore you cannot use a Java enum at all, we
cannot backport it. Basically all you need is a boolean flag for the
immutable types that do not support references at all. We have no intentions
to support multiple modes here, it's just a compatibility helper.

Thanks for your effort so far,
Jörg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Geoff Groos
2015-04-27 20:26:17 UTC
Permalink
Hey J?rg,

Thanks for the note about enums, I had it in my head that enum's have been in java like that since 1.0. This explains why so many APIs inside java are using int constants.

Anyways, my todos:

*
revert the method signatures to maintain binary compatibility. This means I have to use a cast in a couple places but I think its a pretty safe one (ie, I'm not sure theres a flow where the XPathStrategy can return a marshaller that isn't a XPathMarshaller).
*
I'll remove the enums and use static integers (maybe with an odd range of numbers so-as to allow for an assertion that the caller used the proper integers)

And of course when I send you any diffs I'll remove the class files --that was simply me bashing your build system into a repo that I didn't really spend much time setting up. I just wanted a concise list of the diffs, which a github pull-request provides.

Cheers,

-Geoff

Sent from Windows Mail

From: J?rg Schaible<mailto:***@swisspost.com>
Sent: ?Monday?, ?April? ?27?, ?2015 ?4?:?46? ?AM
To: ***@xstream.codehaus.org<mailto:***@xstream.codehaus.org>

Hi Geoff,

sorry for being silent, but we're busy getting several projects out of the
Haus before it shuts down.
Post by Geoff Groos
Hey guys,
https://github.com/Groostav/XStream-GG/pull/1
I think I'd like to modify it to remove the `Context` object --its too big
a klooge to have the configuration need to know if the caller is
Unmarshalling or Marshalling. I'll get around to that today.
let me know what you think.
I had a short look at the diffs. Actually they should not contain any cruft
(.class files, ...).

For the implementation: I would have used the Mapper interface also as
"entry" point. However, you may not modify a method, you have to add a new
one instead (resp. overload a method) to keep binary compatibility. The
master branch is already for XStream 1.5.x with a minimum requirement of
Java 6.

However, I assume, you're interested in getting this into the 1.4.x branch
(Java 1.4.x compatible). Therefore you cannot use a Java enum at all, we
cannot backport it. Basically all you need is a boolean flag for the
immutable types that do not support references at all. We have no intentions
to support multiple modes here, it's just a compatibility helper.

Thanks for your effort so far,
J?rg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email
Jörg Schaible
2015-04-27 22:56:30 UTC
Permalink
Hi Geoff,
Post by Geoff Groos
Hey J?rg,
Thanks for the note about enums, I had it in my head that enum's have been
in java like that since 1.0. This explains why so many APIs inside java
are using int constants.
*
revert the method signatures to maintain binary compatibility. This means
I have to use a cast in a couple places but I think its a pretty safe one
(ie, I'm not sure theres a flow where the XPathStrategy can return a
marshaller that isn't a XPathMarshaller).
??? Just add the method asking for support of references in Mapper.
XStream's binary contract in this area is based on derived types of
MapperWrapper. There's no cast necessary.
Post by Geoff Groos
*
I'll remove the enums and use static integers (maybe with an odd range of
numbers so-as to allow for an assertion that the caller used the proper
integers)
Just a boolean. Support references for a type or not.
Post by Geoff Groos
And of course when I send you any diffs I'll remove the class files --that
was simply me bashing your build system into a repo that I didn't really
spend much time setting up. I just wanted a concise list of the diffs,
which a github pull-request provides.
OK. Note, xtream repo is now at x-stream.github.com/xstream.

Cheers,
Jörg


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Loading...