
          [ISPN-2133] Provide a context for object de-serialization

          One more thought:

          Traditionally, such problems are solved by Java's standard serialization.

          Imagine you're sending in bulk / batches, writing many objects which have some large string in common, or sharing any other large object.

          Standard serialization - including JBoss Marshalling - already supports object graphs and therefore re-materializes a shared object as a single instance.

          So serializing all values of a batch / transaction / queue of operations into one single blob could be much more efficient than serializing multiple commands independently and packing them together afterwards; if we could improve on this, we'd reap the benefits at basically no API change cost.

          Incidentally there would be two benefits:

          • smaller payload on the wire
          • smaller heap consumption on the receiver's side, as identical objects would be deserialized "as one" rather than being materialized as multiple copies

          Thinking about this, it implies that Infinispan might have a problem with "size amplification": if I measure my object size locally to estimate how much space my data grid will need, I'm doing it wrong, as the estimate we'd make today would not take this size amplification into account.
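
The effect described above can be checked with plain JDK serialization. The following is a minimal sketch, not Infinispan code: the class name `SharedGraphDemo`, the helper methods, and the 10,000-character string are all assumptions chosen for illustration. It compares one blob for a whole batch against the sum of independently serialized values, and checks object identity on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Illustrative only: plain JDK serialization, no Infinispan or JBoss Marshalling involved.
class SharedGraphDemo {

    // Serializes one object graph into a single blob.
    static byte[] serialize(Object graph) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(graph);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads back the root object of a blob produced by serialize().
    static Object deserialize(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the large string that every value in the batch refers to.
    static String largeString(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'x');
        return new String(chars);
    }

    public static void main(String[] args) {
        String shared = largeString(10_000);
        String[] first = {"key-1", shared};
        String[] second = {"key-2", shared};

        // One blob for the whole batch: the shared string is written once,
        // the second occurrence becomes a back-reference in the stream.
        byte[] batch = serialize(new Object[] {first, second});

        // Independent serialization: each stream carries its own copy.
        int separate = serialize(first).length + serialize(second).length;

        System.out.println("batch bytes:    " + batch.length);
        System.out.println("separate bytes: " + separate);

        // Receiver side, batch: both values point at the same String instance.
        Object[] received = (Object[]) deserialize(batch);
        String[] a = (String[]) received[0];
        String[] b = (String[]) received[1];
        System.out.println("shared instance (batch):    " + (a[1] == b[1]));

        // Receiver side, independent streams: two distinct copies on the heap.
        String[] c = (String[]) deserialize(serialize(first));
        String[] d = (String[]) deserialize(serialize(second));
        System.out.println("shared instance (separate): " + (c[1] == d[1]));
    }
}
```

With two values sharing one large string, the batch blob should come out at roughly half the combined size of the independent blobs, which is the "size amplification" a local estimate would miss.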

          Sanne Grinovero (Inactive) added a comment - edited

          Blocking support for provided CacheManagers (e.g. via JNDI) in Hibernate OGM
          https://hibernate.atlassian.net/browse/OGM-696

          Emmanuel Bernard added a comment

          It kinda depends on the scope of the context.
          If that's per Infinispan operation, I think it would be really hard to do (esp with the async stuff).
          If that's per cache, I guess when you create a cache you could have a fluent API of some sort to attach a context that would be provided to the externalizer. Or better and more simply, offer a way to attach externalizers just for a specific cache, in which case we could add state to them via their constructor.
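
To make the per-cache idea concrete, here is a purely hypothetical sketch. Nothing in it is an existing Infinispan API: `PerCacheExternalizers`, `ValueReader`, `ContextualReader`, and `attach` are invented names, and the "context" is reduced to a string passed through the constructor.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "externalizers attached to a specific cache".
// None of these types exist in Infinispan; they only illustrate the shape of the idea.
class PerCacheExternalizers {

    // Stand-in for the read side of an externalizer.
    interface ValueReader {
        String read(byte[] payload);
    }

    // An externalizer whose context is fixed once, through its constructor.
    static final class ContextualReader implements ValueReader {
        private final String context;

        ContextualReader(String context) {
            this.context = context;
        }

        @Override
        public String read(byte[] payload) {
            // The context is available without any lookup at deserialization time.
            return context + ":" + new String(payload, StandardCharsets.UTF_8);
        }
    }

    private final Map<String, ValueReader> byCacheName = new ConcurrentHashMap<>();

    // Would be called once, when the cache is defined.
    void attach(String cacheName, ValueReader reader) {
        byCacheName.put(cacheName, reader);
    }

    // Dispatches a payload to the externalizer attached to the given cache.
    String read(String cacheName, byte[] payload) {
        ValueReader reader = byCacheName.get(cacheName);
        if (reader == null) {
            throw new IllegalArgumentException("No externalizer attached to cache " + cacheName);
        }
        return reader.read(payload);
    }

    public static void main(String[] args) {
        PerCacheExternalizers registry = new PerCacheExternalizers();
        registry.attach("orders", new ContextualReader("orders-context"));
        registry.attach("users", new ContextualReader("users-context"));

        byte[] payload = "42".getBytes(StandardCharsets.UTF_8);
        System.out.println(registry.read("orders", payload));
        System.out.println(registry.read("users", payload));
    }
}
```

The same payload is interpreted differently depending on which cache it belongs to, and the externalizer itself never has to ask which cache it is running for.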

          Emmanuel Bernard added a comment

          sgrinove the initial description didn't say anything about having a different context for each cache

          sgrinove rhn-engineering-ebernard any suggestions on how the API should look?

          Dan Berindei (Inactive) added a comment

          Another issue that is making OGM a bit less efficient than it should be. Should we create a tag SLOWING_OGM?

          Emmanuel Bernard added a comment

          dberinde@redhat.com that's an interesting idea, but we'd need to point to a different external service depending on the Cache. Without a context, we don't even know the name of the cache we are working with.

          Sanne Grinovero (Inactive) added a comment

          Dan Berindei (Inactive) added a comment - edited

          We don't currently allow SPIs to use our injection mechanism, so we'd need to start a discussion about that before we can think about the injection option.

          Modifying the marshalling API to give the externalizer a reference to the cache manager would be possible, but I don't think doing a component lookup on every deserialization is a good idea.

          As a workaround, you can construct your externalizer with a reference to your service, and register it with

          globalConfigBuilder.serialization().addAdvancedExternalizer(new CustomExternalizer(service));
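
A rough sketch of what such a `CustomExternalizer` might look like follows. To keep it self-contained, it uses a local stand-in interface (`SimpleExternalizer`) that only mirrors the `writeObject` / `readObject` shape; it is an assumption for illustration and not Infinispan's `AdvancedExternalizer`. `Service` and `Document` are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Illustrative only: shows an externalizer carrying a service reference as constructor state.
class ExternalizerWithServiceDemo {

    // Hypothetical external service that deserialized objects depend on.
    interface Service {
        String resolve(String id);
    }

    // A value that needs the service but must not serialize it.
    static final class Document {
        final String id;
        final Service service;

        Document(String id, Service service) {
            this.id = id;
            this.service = service;
        }

        String content() {
            return service.resolve(id);
        }
    }

    // Local stand-in with a writeObject/readObject shape (an assumption, not Infinispan's interface).
    interface SimpleExternalizer<T> {
        void writeObject(ObjectOutput output, T object) throws IOException;

        T readObject(ObjectInput input) throws IOException, ClassNotFoundException;
    }

    // The service is handed over once, at construction time, so no lookup happens per read.
    static final class CustomExternalizer implements SimpleExternalizer<Document> {
        private final Service service;

        CustomExternalizer(Service service) {
            this.service = service;
        }

        @Override
        public void writeObject(ObjectOutput output, Document document) throws IOException {
            // Only the identifier goes on the wire.
            output.writeUTF(document.id);
        }

        @Override
        public Document readObject(ObjectInput input) throws IOException {
            // The service reference is re-attached on the receiving side.
            return new Document(input.readUTF(), service);
        }
    }

    // Writes a document with the externalizer and reads it back.
    static Document roundTrip(CustomExternalizer externalizer, Document document) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                externalizer.writeObject(out, document);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return externalizer.readObject(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Service service = id -> "content of " + id;
        CustomExternalizer externalizer = new CustomExternalizer(service);
        Document copy = roundTrip(externalizer, new Document("doc-1", service));
        System.out.println(copy.content());
    }
}
```

Because the externalizer is registered globally, this approach gives every cache the same service instance, which is the limitation the earlier comments about per-cache context point at.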
          


            rh-ee-galder Galder Zamarreño
            sgrinove Sanne Grinovero (Inactive)
            Archiver:
            rhn-support-adongare Amol Dongare
