Main

related bits

0

processing priority

3

site type

0 (generic, awaiting analysis)

review version

11

html import

20 (imported)

Events

first seen date

2025-01-01 12:51:12

expired found date

-

created at

2025-01-01 12:51:12

updated at

2025-01-01 12:51:13

Domain name statistics

length

20

crc

20078

tld

86

nm parts

0

nm random digits

0

nm rare letters

0

Connections

is subdomain of id

87719371 (github.io)

previous id

0

replaced with id

0

related id

-

dns primary id

0

dns alternative id

0

lifecycle status

0 (unclassified, or currently active)

Subdomains and pages

deleted subdomains

0

page imported products

0

page imported random

0

page imported parking

0

Error counters

count skipped due to recent timeouts on the same server IP

0

count content received but rejected due to 11-799

0

count dns errors

0

count cert errors

0

count timeouts

0

count http 429

0

count http 404

0

count http 403

0

count http 5xx

0

next operation date

-

Server

server bits

server ip

-

Mainpage statistics

mp import status

20

mp rejected date

-

mp saved date

-

mp size orig

17699

mp size raw text

4956

mp inner links count

0

mp inner links status

1 (no links)

Open Graph

title

description

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

image

site name

author

updated

2026-03-03 23:23:00

raw text

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Chuofan Ma 1, Yi Jiang 2†, Jiannan Wu 1, Zehuan Yuan 2, Xiaojuan Qi 1†, 1 The University of Hong Kong, 2 ByteDance Inc. Paper Code 🤗 Demo (Coming Soon) Abstract We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability. Beyond holistic image understanding, Groma is adept at region-level tasks such as region captioning and visual grounding. Such capabilities are built upon a localized visual tokenization mechanism, where an image is decomposed into regions of interest and subsequently encoded into region tokens. By integrating region tokens into user instructions and model responses...

Text analysis

redirect type

0 (-)

block type

0 (no issues)

detected language

1 (English)

category id

AI applications (149)

index version

1

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

3692

text words

624

text unique words

289

text lines

85

text sentences

32

text paragraphs

10

text words per sentence

19

text matched phrases

0

text matched dictionaries

0

RSS

rss path

rss status

1 (priority 1 already searched, no matches found)

rss found date

-

rss size orig

0

rss items

0

rss spam phrases

0

rss detected language

0 (awaiting analysis)

inbefore feed id

-

inbefore status

0 (new)

Sitemap

sitemap path

sitemap status

1 (priority 1 already searched, no matches found)

sitemap review version

1

sitemap urls count

0

sitemap urls adult

0

sitemap filtered products

0

sitemap filtered videos

0

sitemap found date

-

sitemap process date

-

sitemap first import date

-

sitemap last import date

-