I'm currently working with YOLOv4 and the COCO dataset. I create the model directly from the pre-trained weights (you can download them here) and then convert it to TensorFlow Lite format with the following code:
from tensorflow import lite as tflite

model = load_yolo(weights)  # function that builds the YOLOv4 model and loads the weights
tflite_converter = tflite.TFLiteConverter.from_keras_model(model)
tflite_model = tflite_converter.convert()
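For reference, this is more or less how I save the converted model and time a single-image inference; the 416x416 input size and the random dummy image are just placeholders for my real preprocessing pipeline:

import time
import numpy as np
import tensorflow as tf

# Save the converted model to disk
with open("yolov4.tflite", "wb") as f:
    f.write(tflite_model)

# Load it back with the TFLite interpreter
interpreter = tf.lite.Interpreter(model_path="yolov4.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Stand-in for a real preprocessed image (NHWC, float32)
image = np.random.rand(1, 416, 416, 3).astype(np.float32)

start = time.time()
interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
predictions = [interpreter.get_tensor(d["index"]) for d in output_details]
print(f"Inference took {time.time() - start:.2f} s")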
My main goal is to get a faster (and ideally smaller) YOLOv4 model. Currently I run the model on my laptop and it takes between 3 and 4 seconds to run inference on a single image, which is why I decided to convert it to TensorFlow Lite. However, I'm getting very similar results time-wise: the TensorFlow Lite model also takes between 3 and 4 seconds per image. Furthermore, the resulting .tflite file is roughly the same size as the weights file (~250 MB). In addition, when I quantize the model (tflite_converter.optimizations = [tflite.Optimize.DEFAULT]), I get even worse timing results (although the .tflite file is ~4x smaller).
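For completeness, this is roughly how I applied the quantization; the float16 variant at the end is just an alternative I'm aware of but haven't benchmarked yet:

import tensorflow as tf

# Dynamic-range quantization: this is the conversion that produced the
# ~4x smaller (but slower) .tflite file mentioned above.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_model = converter.convert()

# Float16 quantization (not benchmarked yet): should roughly halve the
# file size while keeping float computation at runtime.
fp16_converter = tf.lite.TFLiteConverter.from_keras_model(model)
fp16_converter.optimizations = [tf.lite.Optimize.DEFAULT]
fp16_converter.target_spec.supported_types = [tf.float16]
fp16_model = fp16_converter.convert()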
Even though TensorFlow Lite is said to be optimized for edge devices, whenever I converted a model with it on my laptop in the past I always got faster (and smaller) models, which is why I took this approach with YOLOv4. Is there a reason for these poor performance results with the YOLOv4 model? Is this a good approach to get a faster version of YOLOv4 on my laptop? And are there other options or better approaches to get a faster version of YOLOv4 (such as YOLOv4-tiny)?
NOTE: I have installed TensorFlow 2.4.1.
question from: https://stackoverflow.com/questions/65950717/why-there-is-no-improvement-in-inference-time-size-when-yolov4-is-transformed-wi