Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - How to stream data to database BLOB using Hibernate (no in-memory storing in byte[])

I'm looking for a way to stream binary data to and from a database. If possible, I'd like it to be done with Hibernate (in a database-agnostic way). All the solutions I've found involve explicitly or implicitly loading the binary data into memory as a byte[]. I need to avoid that. Let's say I want my code to be able to write a 2 GB video from the database (stored in a BLOB column) to a local file, or the other way around, using no more than 256 MB of memory. It's clearly achievable and involves no voodoo, but I can't find a way, and for now I'm trying to avoid debugging Hibernate.

Let's look at some sample code (keeping in mind -Xmx256m).

Entity class:

public class SimpleBean {
    private Long id;
    private Blob data;
    // ... skipping getters, setters and constructors.
}

Hibernate mapping fragment:

<class name="SimpleBean" table="SIMPLE_BEANS">
    <id name="id" column="SIMPLE_BEAN_ID">
        <generator class="increment" />
    </id>
    <property name="data" type="blob" column="DATA" />
</class>

Test code fragment:

Configuration cfg = new Configuration().configure("hibernate.cfg.xml");
ServiceRegistry serviceRegistry = new ServiceRegistryBuilder()
                                      .applySettings(cfg.getProperties())
                                      .buildServiceRegistry();

SessionFactory sessionFactory = cfg.buildSessionFactory(serviceRegistry);
Session session = sessionFactory.openSession();
session.beginTransaction();

File dataFile = new File("movie_1gb.avi");
long dataSize = dataFile.length();
InputStream dataStream = new FileInputStream(dataFile);

LobHelper lobHelper = session.getLobHelper();
Blob dataBlob = lobHelper.createBlob(dataStream, dataSize);

session.save( new SimpleBean(dataBlob) );
session.getTransaction().commit(); // Throws java.lang.OutOfMemoryError
session.close();

dataStream.close();
sessionFactory.close();

When running that snippet I get an OutOfMemoryError. The stack trace shows that Hibernate tries to load the whole stream into memory and runs out of heap (as it should). Here's the stack trace:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at org.hibernate.type.descriptor.java.DataHelper.extractBytes(DataHelper.java:183)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:121)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:45)
at org.hibernate.type.descriptor.sql.BlobTypeDescriptor$4$1.doBind(BlobTypeDescriptor.java:105)
at org.hibernate.type.descriptor.sql.BasicBinder.bind(BasicBinder.java:92)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:305)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:300)
at org.hibernate.type.AbstractSingleColumnStandardBasicType.nullSafeSet(AbstractSingleColumnStandardBasicType.java:57)
at org.hibernate.persister.entity.AbstractEntityPersister.dehydrate(AbstractEntityPersister.java:2603)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2857)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3301)
at org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:88)
at org.hibernate.engine.spi.ActionQueue.execute(ActionQueue.java:362)
at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:354)
at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:275)
at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:326)
at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:52)
at org.hibernate.internal.SessionImpl.flush(SessionImpl.java:1214)
at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:403)
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.beforeTransactionCommit(JdbcTransaction.java:101)
at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.commit(AbstractTransactionImpl.java:175)
at ru.swemel.msgcenter.domain.SimpleBeanTest.testBasicUsage(SimpleBeanTest.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
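
The top frames of the trace (DataHelper.extractBytes writing into a ByteArrayOutputStream) suggest the whole stream is being collected on the heap. The following is only a rough sketch of the difference between that pattern and a bounded-memory copy; it is not Hibernate's actual code, and the class and method names (BufferVsStream, readAll, copy) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, for illustration only.
class BufferVsStream {

    // Collects the whole stream on the heap: memory use grows with the input size.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            // The backing array is regrown and copied whenever it fills up.
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Passes data through a fixed 8 KB buffer: memory use does not depend on the input size.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }
}
```

With a 1 GB file and a 256 MB heap, something shaped like readAll has to fail, while something shaped like copy should not care about the file size.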

I'm using Hibernate 4.1.5.SP1. The exact question is: how can I avoid loading the stream into memory when storing a blob in the database with Hibernate, and use direct streaming instead? I'd like to avoid off-topic discussion about why one would store video in a database column instead of storing it in some content repository and linking to it. Please consider it a model that is irrelevant to the question.

It seems that different dialects might have different capabilities, and Hibernate might try to load everything into memory because the underlying database doesn't support streaming blobs or something like that. If that's the case, I'd like to see some kind of comparison table of how the different dialects handle blobs.

Thank you very much for your help!


1 Answer


For those looking for the same thing.

My bad, the code works as it is supposed to (it streams without trying to copy to memory) for PostgreSQL (and probably lots of others). Hibernate's internal behavior depends on the selected dialect. The one I used in the first place overrides direct use of streams in favor of a BinaryStream backed by a byte[].
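
For comparison, here is a minimal sketch of what a streaming insert looks like at the plain JDBC level, below Hibernate. The table and column names come from the mapping in the question; the class and method names (JdbcBlobInsert, insert) are hypothetical. Whether the bytes are really streamed or buffered internally still depends on the JDBC driver, so treat this as an assumption to verify for your database.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
class JdbcBlobInsert {

    // Table and column names taken from the mapping fragment in the question.
    static final String INSERT_SQL =
            "INSERT INTO SIMPLE_BEANS (SIMPLE_BEAN_ID, DATA) VALUES (?, ?)";

    // Binds the file as a stream of known length instead of a byte[].
    // The statement is expected to have been prepared from INSERT_SQL.
    static int insert(PreparedStatement ps, long id, Path file)
            throws SQLException, IOException {
        try (InputStream in = Files.newInputStream(file)) {
            ps.setLong(1, id);
            ps.setBinaryStream(2, in, Files.size(file));
            // The stream must stay open until the statement has executed.
            return ps.executeUpdate();
        }
    }
}
```

A caller would obtain the statement with connection.prepareStatement(JdbcBlobInsert.INSERT_SQL) inside a transaction and close it afterwards.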

Also, there are no performance problems, since it loads only the OID (a number) in the case of PostgreSQL, and probably lazy-loads the data in the case of other dialects (including the byte[] implementation). I just ran some quick and dirty tests: no visible difference in 10000 loads of the entity with and without the binary data field.
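
For the other direction mentioned in the question (database to local file), a sketch along these lines should keep memory bounded, assuming the driver hands back a real stream from the Blob rather than a byte[]-backed one. BlobExport and its methods are hypothetical names; only java.sql.Blob.getBinaryStream() is standard JDBC.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Blob;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
class BlobExport {

    // Copies the blob content through a fixed 64 KB buffer and returns the byte count.
    static long writeTo(Blob blob, OutputStream out) throws SQLException, IOException {
        byte[] chunk = new byte[64 * 1024];
        long total = 0;
        try (InputStream in = blob.getBinaryStream()) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
            }
        }
        return total;
    }

    // Convenience wrapper that writes the blob content to a file.
    static long writeToFile(Blob blob, Path target) throws SQLException, IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            return writeTo(blob, out);
        }
    }
}
```

With the entity from the question this would be called as BlobExport.writeToFile(bean.getData(), target), assuming a getData() getter, while the session and transaction that loaded the entity are still open, since a Blob locator is generally only usable within its transaction.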

Storing data in the database seems to be slower than just saving it to disk as an external file, though. But it saves you a lot of headaches when backing up, dealing with the limitations of a particular file system, handling concurrent updates, etc. But that's off-topic.

