[ICCV 2019] Harmonious Bottleneck on Two Orthogonal Dimensions, surpassing MobileNetV2
Official PyTorch implementation of the HBONet architecture described in HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions (ICCV'19) by Duo Li, Aojun Zhou and Anbang Yao, evaluated on the ILSVRC2012 benchmark.
We integrate our HBO modules into the state-of-the-art MobileNetV2 backbone as a reference case. The corresponding MobileNetV2 baseline models are available in my repository mobilenetv2.pytorch.
Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
The following statistics are reported on the ILSVRC2012 validation set with single center crop testing.
Architecture | MFLOPs | Top-1 / Top-5 Acc. (%) |
---|---|---|
HBONet 1.0 | 305 | 73.1 / 91.0 |
HBONet 0.8 | 205 | 71.3 / 89.7 |
HBONet 0.5 | 96 | 67.0 / 86.9 |
HBONet 0.35 | 61 | 62.4 / 83.7 |
HBONet 0.25 | 37 | 57.3 / 79.8 |
HBONet 0.1 | 14 | 41.5 / 65.7 |
Architecture | MFLOPs | Top-1 / Top-5 Acc. (%) |
---|---|---|
HBONet 0.8 224x224 | 205 | 71.3 / 89.7 |
HBONet 0.8 192x192 | 150 | 70.0 / 89.2 |
HBONet 0.8 160x160 | 105 | 68.3 / 87.8 |
HBONet 0.8 128x128 | 68 | 65.5 / 85.9 |
HBONet 0.8 96x96 | 39 | 61.4 / 83.0 |
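The MFLOPs column in the table above tracks input resolution closely: convolutional cost grows roughly with the number of input pixels, so scaling the 224x224 entry quadratically gives a quick estimate. The sketch below checks this against the HBONet 0.8 rows. `scale_mflops` is an illustrative helper, and the quadratic rule is an approximation, not the authors' counting method.

```python
def scale_mflops(base_mflops: float, base_res: int, new_res: int) -> float:
    """Estimate MFLOPs at new_res, assuming cost grows with the number of input pixels."""
    return base_mflops * (new_res / base_res) ** 2


# Reported MFLOPs for HBONet 0.8 at reduced resolutions (from the table above).
reported = {192: 150, 160: 105, 128: 68, 96: 39}

# Estimates derived from the 224x224 entry (205 MFLOPs).
estimates = {res: scale_mflops(205, 224, res) for res in reported}

for res, est in estimates.items():
    print(f"{res}x{res}: estimated {est:.1f} MFLOPs, reported {reported[res]}")
```

Each estimate lands within about 2 MFLOPs of the reported value.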
Architecture | MFLOPs | Top-1 / Top-5 Acc. (%) |
---|---|---|
HBONet 0.35 224x224 | 61 | 62.4 / 83.7 |
HBONet 0.35 192x192 | 45 | 60.9 / 82.6 |
HBONet 0.35 160x160 | 31 | 58.6 / 80.7 |
HBONet 0.35 128x128 | 21 | 55.2 / 78.0 |
HBONet 0.35 96x96 | 12 | 50.3 / 73.8 |
Architecture | MFLOPs | Top-1 / Top-5 Acc. (%) |
---|---|---|
HBONet 0.5 224x224 | 98 | 67.7 / 87.4 |
HBONet 0.6 192x192 | 108 | 67.3 / 87.3 |
Architecture | MFLOPs | Top-1 / Top-5 Acc. (%) |
---|---|---|
HBONet(2x) 0.25 | 44 | 58.3 / 80.6 |
HBONet(4x) 0.25 | 45 | 59.3 / 81.4 |
HBONet(8x) 0.25 | 45 | 58.2 / 80.4 |
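For reference, the Top-1 / Top-5 figures in the tables above count a prediction as correct when the true label is among the 1 or 5 highest-scoring classes. The following is a minimal pure-Python sketch of that metric on toy data. `topk_accuracy` and the scores are illustrative and not taken from this repository, and the toy example uses k=2 because it has only three classes.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  k: int) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        correct += label in topk
    return 100.0 * correct / len(labels)


# Toy scores for 4 samples over 3 classes (hypothetical values).
scores = [
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
]
labels = [1, 1, 2, 2]

top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
print(f"Top-1: {top1:.1f}%  Top-2: {top2:.1f}%")
```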
Taking HBONet 1.0 as an example, pretrained models can be imported with the following lines and then fine-tuned for other vision tasks or deployed on resource-constrained platforms. (To create the variant models in Tables 5 and 6, first make the slight modifications described in the docstrings of the model file.)
import torch
from models.imagenet import hbonet

# Build HBONet 1.0, then load the released weights
net = hbonet()
net.load_state_dict(torch.load('pretrained/hbonet_1_0.pth'))
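Checkpoints saved from a model wrapped in `torch.nn.DataParallel` store every parameter under a `module.` prefix, which makes `load_state_dict` on a bare model fail with missing and unexpected keys. Whether the released files need this is not stated here. The helper below is a minimal sketch for that case; `strip_module_prefix` and the key names in the example are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Mapping


def strip_module_prefix(state_dict: Mapping[str, Any],
                        prefix: str = "module.") -> "OrderedDict[str, Any]":
    """Return a copy of state_dict with a leading prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip when the key actually starts with the prefix.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Hypothetical keys standing in for a real checkpoint's contents.
example = OrderedDict([("module.features.0.weight", 1), ("classifier.bias", 2)])
cleaned = strip_module_prefix(example)
print(list(cleaned))
```

With a real checkpoint, the result would be passed to `net.load_state_dict(...)` in place of the raw `torch.load(...)` output.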
The following configuration reproduces our reported results. It is identical to the one used in mobilenetv2.pytorch, for a fair comparison.
python imagenet.py \
-a hbonet \
-d <path-to-ILSVRC2012-data> \
--epochs 150 \
--lr-decay cos \
--lr 0.05 \
--wd 4e-5 \
-c <path-to-save-checkpoints> \
--width-mult <width-multiplier> \
--input-size <input-resolution> \
-j <num-workers>
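The `--lr-decay cos` option, together with `--lr 0.05` and `--epochs 150`, suggests a cosine-annealed learning rate. A minimal sketch of the standard schedule follows, assuming decay to zero, no warmup, and one update per epoch. The actual script may update per iteration or differ in other details, and `cosine_lr` is an illustrative name, not a function from this repository.

```python
import math


def cosine_lr(base_lr: float, epoch: int, total_epochs: int) -> float:
    """Cosine-annealed learning rate at the start of a 0-indexed epoch."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))


# Learning rate at the start, middle, and last epoch of a 150-epoch run.
for epoch in (0, 75, 149):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(0.05, epoch, 150):.6f}")
```

The rate starts at the base value, reaches half of it at the midpoint, and approaches zero in the final epochs.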
To evaluate a pretrained model on the validation set:
python imagenet.py \
-a hbonet \
-d <path-to-ILSVRC2012-data> \
--weight <pretrained-pth-file> \
--width-mult <width-multiplier> \
--input-size <input-resolution> \
-e
If you find our work useful in your research, please consider citing:
@InProceedings{Li_2019_ICCV,
author = {Li, Duo and Zhou, Aojun and Yao, Anbang},
title = {HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2019}
}