Add training configs for SuperPoint (MagicLeap) and DISK with LightGlue (#18)

* add superpoint and disk train configs
* update README
2023-10-16 13:25:07 +02:00 · 692c72f94c
parent 398c4b8c21
commit 692c72f94c
5 changed files with 244 additions and 9 deletions
--- a/README.md
+++ b/README.md
@@ -25,7 +25,7 @@ python3 -m pip install -e .[extra]
 All models and datasets in gluefactory have auto-downloaders, so you can get started right away!
 ## License
-The code and trained models in Glue Factory are released with an Apache-2.0 license. This includes LightGlue trained with an [open version of SuperPoint](https://github.com/rpautrat/SuperPoint). Third-party models that are not compatible with this license, such as SuperPoint (original) and SuperGlue, are provided in `gluefactory_nonfree`, where each model might follow its own, restrictive license.
+The code and trained models in Glue Factory are released with an Apache-2.0 license. This includes LightGlue and an [open version of SuperPoint](https://github.com/rpautrat/SuperPoint). Third-party models that are not compatible with this license, such as SuperPoint (original) and SuperGlue, are provided in `gluefactory_nonfree`, where each model might follow its own, restrictive license.
 ## Evaluation
@@ -223,18 +223,18 @@ All training commands automatically download the datasets.
 <details>
 <summary>[Training LightGlue]</summary>
-We show how to train LightGlue with [SuperPoint open](https://github.com/rpautrat/SuperPoint).
+We show how to train LightGlue with [SuperPoint](https://github.com/magicleap/SuperPointPretrainedNetwork).
 We first pre-train LightGlue on the homography dataset:
 ```bash
 # sp+lg_homography is the experiment name
 python -m gluefactory.train sp+lg_homography \
-    --conf gluefactory/configs/superpoint-open+lightglue_homography.yaml
+    --conf gluefactory/configs/superpoint+lightglue_homography.yaml
 ```
 Feel free to use any other experiment name. By default the checkpoints are written to `outputs/training/`. The default batch size of 128 corresponds to the results reported in the paper and requires 2x 3090 GPUs with 24GB of VRAM each as well as PyTorch >= 2.0 (FlashAttention).
 Configurations are managed by [OmegaConf](https://omegaconf.readthedocs.io/) so any entry can be overridden from the command line.
 If you have PyTorch < 2.0 or weaker GPUs, you may thus need to reduce the batch size via:
 ```bash
 python -m gluefactory.train sp+lg_homography \
-    --conf gluefactory/configs/superpoint-open+lightglue_homography.yaml  \
+    --conf gluefactory/configs/superpoint+lightglue_homography.yaml  \
    data.batch_size=32  # for 1x 1080 GPU
 ```
 Be aware that this can impact the overall performance. You might need to adjust the learning rate accordingly.
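To illustrate what a dotted override such as `data.batch_size=32` does to the nested YAML config, here is a minimal sketch in plain Python. It does not use OmegaConf and is not how Glue Factory implements overrides; the helper name `apply_override` and its simplified value parsing are assumptions made for this example only.

```python
import copy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `a.b.c=value` override applied.

    Hypothetical helper for illustration; OmegaConf handles typing,
    interpolation, and validation far more carefully than this.
    """
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    result = copy.deepcopy(config)
    node = result
    # Walk (or create) the nested dictionaries named by the dotted key.
    for name in parents:
        node = node.setdefault(name, {})
    # Rough value parsing: integers become ints, everything else stays a string.
    node[leaf] = int(raw_value) if raw_value.lstrip("-").isdigit() else raw_value
    return result


base = {"data": {"name": "homographies", "batch_size": 128}}
updated = apply_override(base, "data.batch_size=32")
print(updated["data"]["batch_size"])  # 32
print(base["data"]["batch_size"])  # 128, the base config is left untouched
```

The same mechanism covers entries that are absent from the YAML file, such as `train.load_experiment=sp+lg_homography`, which in this sketch simply creates the missing key.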
@@ -242,17 +242,17 @@ Be aware that this can impact the overall performance. You might need to adjust
 We then fine-tune the model on the MegaDepth dataset:
 ```bash
 python -m gluefactory.train sp+lg_megadepth \
-    --conf gluefactory/configs/superpoint-open+lightglue_megadepth.yaml \
+    --conf gluefactory/configs/superpoint+lightglue_megadepth.yaml \
    train.load_experiment=sp+lg_homography
 ```
 Here the default batch size is 32. To speed up training on MegaDepth, we suggest caching the local features before training (requires around 150 GB of disk space):
 ```bash
 # extract features
-python -m gluefactory.scripts.export_megadepth --method sp_open --num_workers 8
+python -m gluefactory.scripts.export_megadepth --method sp --num_workers 8
 # run training with cached features
 python -m gluefactory.train sp+lg_megadepth \
-    --conf gluefactory/configs/superpoint-open+lightglue_megadepth.yaml \
+    --conf gluefactory/configs/superpoint+lightglue_megadepth.yaml \
    train.load_experiment=sp+lg_homography \
    data.load_features.do=True
 ```
@@ -297,10 +297,10 @@ Using the following local feature extractors:
 | Model     | LightGlue config |
 | --------- | --------- |
 | [SuperPoint (open)](https://github.com/rpautrat/SuperPoint) | `superpoint-open+lightglue_{homography,megadepth}.yaml` |
-| [SuperPoint (official)](https://github.com/magicleap/SuperPointPretrainedNetwork) | ❌ TODO |
+| [SuperPoint (official)](https://github.com/magicleap/SuperPointPretrainedNetwork) | `superpoint+lightglue_{homography,megadepth}.yaml` |
 | SIFT (via [pycolmap](https://github.com/colmap/pycolmap)) | `sift+lightglue_{homography,megadepth}.yaml` |
 | [ALIKED](https://github.com/Shiaoming/ALIKED) | `aliked+lightglue_{homography,megadepth}.yaml` |
-| [DISK](https://github.com/cvlab-epfl/disk) | ❌ TODO |
+| [DISK](https://github.com/cvlab-epfl/disk) | `disk+lightglue_{homography,megadepth}.yaml` |
 | Key.Net + HardNet | ❌ TODO |
 ## Coming soon
--- a/gluefactory/configs/disk+lightglue_homography.yaml
+++ b/gluefactory/configs/disk+lightglue_homography.yaml
@@ -0,0 +1,47 @@
 data:
    name: homographies
    data_dir: revisitop1m
    train_size: 150000
    val_size: 2000
    batch_size: 128
    num_workers: 14
    homography:
        difficulty: 0.7
        max_angle: 45
    photometric:
        name: lg
 model:
    name: two_view_pipeline
    extractor:
        name: extractors.disk_kornia
        max_num_keypoints: 512
        force_num_keypoints: True
        detection_threshold: 0.0
        trainable: False
    ground_truth:
        name: matchers.homography_matcher
        th_positive: 3
        th_negative: 3
    matcher:
        name: matchers.lightglue
        filter_threshold: 0.1
        input_dim: 128
        flash: false
        checkpointed: true
 train:
    seed: 0
    epochs: 40
    log_every_iter: 100
    eval_every_iter: 500
    lr: 1e-4
    lr_schedule:
        start: 20
        type: exp
        on_epoch: true
        exp_div_10: 10
    plot: [5, 'gluefactory.visualization.visualize_batch.make_match_figures']
 benchmarks:
    hpatches:
      eval:
        estimator: opencv
        ransac_th: 0.5
--- a/gluefactory/configs/disk+lightglue_megadepth.yaml
+++ b/gluefactory/configs/disk+lightglue_megadepth.yaml
@@ -0,0 +1,70 @@
 data:
    name: megadepth
    preprocessing:
        resize: 1024
        side: long
        square_pad: True
    train_split: train_scenes_clean.txt
    train_num_per_scene: 300
    val_split: valid_scenes_clean.txt
    val_pairs: valid_pairs.txt
    min_overlap: 0.1
    max_overlap: 0.7
    num_overlap_bins: 3
    read_depth: true
    read_image: true
    batch_size: 32
    num_workers: 14
    load_features:
        do: false  # enable this if you have cached predictions
        path: exports/megadepth-undist-depth-r1024_DISK-k2048-nms5/{scene}.h5
        padding_length: 2048
        padding_fn: pad_local_features
 model:
    name: two_view_pipeline
    extractor:
        name: extractors.disk_kornia
        max_num_keypoints: 512
        force_num_keypoints: True
        detection_threshold: 0.0
        trainable: False
    ground_truth:
        name: matchers.depth_matcher
        th_positive: 3
        th_negative: 5
        th_epi: 5
    matcher:
        name: matchers.lightglue
        filter_threshold: 0.1
        input_dim: 128
        flash: false
        checkpointed: true
    allow_no_extract: True
 train:
    seed: 0
    epochs: 50
    log_every_iter: 100
    eval_every_iter: 1000
    lr: 1e-4
    lr_schedule:
        start: 30
        type: exp
        on_epoch: true
        exp_div_10: 10
    dataset_callback_fn: sample_new_items
    plot: [5, 'gluefactory.visualization.visualize_batch.make_match_figures']
 benchmarks:
    megadepth1500:
        data:
            preprocessing:
                side: long
                resize: 1024
        eval:
            estimator: opencv
            ransac_th: 0.5
    hpatches:
        eval:
            estimator: opencv
            ransac_th: 0.5
        model:
            extractor:
                max_num_keypoints: 1024
--- a/gluefactory/configs/superpoint+lightglue_homography.yaml
+++ b/gluefactory/configs/superpoint+lightglue_homography.yaml
@@ -0,0 +1,47 @@
 data:
    name: homographies
    data_dir: revisitop1m
    train_size: 150000
    val_size: 2000
    batch_size: 128
    num_workers: 14
    homography:
        difficulty: 0.7
        max_angle: 45
    photometric:
        name: lg
 model:
    name: two_view_pipeline
    extractor:
        name: gluefactory_nonfree.superpoint
        max_num_keypoints: 512
        force_num_keypoints: True
        detection_threshold: 0.0
        nms_radius: 3
        trainable: False
    ground_truth:
        name: matchers.homography_matcher
        th_positive: 3
        th_negative: 3
    matcher:
        name: matchers.lightglue
        filter_threshold: 0.1
        flash: false
        checkpointed: true
 train:
    seed: 0
    epochs: 40
    log_every_iter: 100
    eval_every_iter: 500
    lr: 1e-4
    lr_schedule:
        start: 20
        type: exp
        on_epoch: true
        exp_div_10: 10
    plot: [5, 'gluefactory.visualization.visualize_batch.make_match_figures']
 benchmarks:
    hpatches:
      eval:
        estimator: opencv
        ransac_th: 0.5
--- a/gluefactory/configs/superpoint+lightglue_megadepth.yaml
+++ b/gluefactory/configs/superpoint+lightglue_megadepth.yaml
@@ -0,0 +1,71 @@
 data:
    name: megadepth
    preprocessing:
        resize: 1024
        side: long
        square_pad: True
    train_split: train_scenes_clean.txt
    train_num_per_scene: 300
    val_split: valid_scenes_clean.txt
    val_pairs: valid_pairs.txt
    min_overlap: 0.1
    max_overlap: 0.7
    num_overlap_bins: 3
    read_depth: true
    read_image: true
    batch_size: 32
    num_workers: 14
    load_features:
        do: false  # enable this if you have cached predictions
        path: exports/megadepth-undist-depth-r1024_SP-k2048-nms3/{scene}.h5
        padding_length: 2048
        padding_fn: pad_local_features
 model:
    name: two_view_pipeline
    extractor:
        name: gluefactory_nonfree.superpoint
        max_num_keypoints: 2048
        force_num_keypoints: True
        detection_threshold: 0.0
        nms_radius: 3
        trainable: False
    matcher:
        name: matchers.lightglue
        filter_threshold: 0.1
        flash: false
        checkpointed: true
    ground_truth:
        name: matchers.depth_matcher
        th_positive: 3
        th_negative: 5
        th_epi: 5
    allow_no_extract: True
 train:
    seed: 0
    epochs: 50
    log_every_iter: 100
    eval_every_iter: 1000
    lr: 1e-4
    lr_schedule:
        start: 30
        type: exp
        on_epoch: true
        exp_div_10: 10
    dataset_callback_fn: sample_new_items
    plot: [5, 'gluefactory.visualization.visualize_batch.make_match_figures']
 benchmarks:
    megadepth1500:
        data:
            preprocessing:
                side: long
                resize: 1600
        eval:
            estimator: opencv
            ransac_th: 0.5
    hpatches:
        eval:
            estimator: opencv
            ransac_th: 0.5
        model:
            extractor:
                max_num_keypoints: 1024
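
For readers of the `lr_schedule` blocks in the configs above, the following Python sketch shows one plausible reading of `type: exp` with `on_epoch: true`. It assumes the learning rate stays at `lr` until epoch `start` and is then divided by 10 every `exp_div_10` epochs. That interpretation is inferred from the key names, not confirmed by this commit, and the exact epoch at which decay begins may differ by one in the actual trainer. The function `lr_at_epoch` is hypothetical.

```python
import math


def lr_at_epoch(epoch: int, base_lr: float = 1e-4, start: int = 20,
                exp_div_10: int = 10) -> float:
    """Learning rate under the assumed schedule (hypothetical helper).

    Constant until `start`, then an exponential decay that loses one
    order of magnitude every `exp_div_10` epochs.
    """
    decay_epochs = max(0, epoch - start)
    return base_lr * 10 ** (-decay_epochs / exp_div_10)


# Homography configs: start=20, 40 epochs in total.
for epoch in (0, 20, 30, 40):
    print(epoch, f"{lr_at_epoch(epoch):.2e}")

# MegaDepth configs: start=30, 50 epochs in total.
final_megadepth_lr = lr_at_epoch(50, start=30)
print(math.isclose(final_megadepth_lr, 1e-6))
```

Under this reading both training stages end two orders of magnitude below the initial `lr` of `1e-4`.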