This is the first article in a series about running AI models on Mac Studio, as I continue migrating my environment from CUDA on Nvidia GPUs to Apple's MPS backend.
Why did I choose Mac Studio?
I chose Mac Studio because it is less expensive than a comparable Nvidia setup. It has 192GB of unified memory, which is shared with the GPU. This makes it possible to migrate programs from Nvidia GPUs and save some money for personal use.
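Before migrating anything, it is worth confirming that your PyTorch build can actually see the MPS backend. A minimal check, assuming a recent PyTorch (1.12 or later) installed on Apple silicon:

import torch

print(torch.backends.mps.is_available())  # True if the Metal backend is usable right now
print(torch.backends.mps.is_built())      # True if this PyTorch build was compiled with MPS
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(device)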
What is Fuyu-8B?
As Adept's release announcement puts it: "We are releasing Fuyu-8B, a small version of the multimodal model that powers our product." The model is available on HuggingFace. They think Fuyu-8B is exciting because:
It has a much simpler architecture and training procedure than other multimodal models, making it easier to understand, scale, and deploy.
It is designed from the ground up for digital agents, so it can support arbitrary image resolutions, answer questions about graphs and diagrams, answer UI-based questions, and perform fine-grained localization on screen images.
It is fast – we can get responses for large images in less than 100 milliseconds.
Despite being optimized for our use case, it performs well at standard image understanding benchmarks such as visual question-answering and natural-image-captioning.
Ok, let’s do it now.
Prepare the environment:
You need Python 3.8+ (recent transformers releases no longer support 3.6) and virtualenv installed. Conda or venv also work.
virtualenv -p python3 py3
source py3/bin/activate
Install the HuggingFace transformers library from source by cloning it from GitHub (Fuyu support requires a recent version):
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install .
You are almost done; now we can run the samples.
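The samples below use model_id = ".", i.e. they expect the fuyu-8b weights to be in the current directory. If you have not downloaded them yet, one way to do that is with huggingface_hub (a sketch; "adept/fuyu-8b" is the Hub repo id referenced later in this article, and the checkpoint is large, so expect a long download):

# Sketch: fetch the fuyu-8b weights into the current directory,
# so that model_id = "." in the samples below resolves to a local copy.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="adept/fuyu-8b", local_dir=".")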
Sample 1:
from transformers import FuyuProcessor, FuyuForCausalLM
from PIL import Image
import torch

# load model and processor
model_id = "."  # local directory with the fuyu-8b weights; "adept/fuyu-8b" also works
processor = FuyuProcessor.from_pretrained(model_id)
model = FuyuForCausalLM.from_pretrained(model_id, device_map="mps", torch_dtype=torch.float16)

# prepare inputs for the model
text_prompt = "Generate a coco-style caption.\n"
image_path = "bus.png"  # https://huggingface.co/adept-hf-collab/fuyu-8b/blob/main/bus.png
image = Image.open(image_path)

inputs = processor(text=text_prompt, images=image, return_tensors="pt")
for k, v in inputs.items():
    inputs[k] = v.to("mps")

# autoregressively generate text
generation_output = model.generate(**inputs, max_new_tokens=7)
generation_text = processor.batch_decode(generation_output[:, -7:], skip_special_tokens=True)
print(generation_text)
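Adept quotes sub-100 ms responses on their hardware; to see what you get on MPS, you can time the generate call yourself. A minimal sketch reusing the model and inputs from Sample 1 (torch.mps.synchronize() is needed because MPS kernels run asynchronously, so wall-clock timing without it would be misleading):

import time

torch.mps.synchronize()  # finish any queued MPS work before starting the clock
start = time.time()
generation_output = model.generate(**inputs, max_new_tokens=7)
torch.mps.synchronize()  # wait for the GPU to finish before stopping the clock
print(f"generate() took {time.time() - start:.2f}s")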
Sample 2:
import os
from transformers import FuyuProcessor, FuyuForCausalLM
from PIL import Image
import torch

def list_files_in_directory(path, extensions=(".png", ".jpeg", ".jpg")):
    # Case-insensitive match on common image extensions
    return [f for f in os.listdir(path)
            if os.path.isfile(os.path.join(path, f)) and f.lower().endswith(extensions)]

def main():
    # load model and processor
    model_id = "."  # or "adept/fuyu-8b"
    processor = FuyuProcessor.from_pretrained(model_id)
    # To avoid OOM, float16 allows the model to run in about 24GB of memory.
    # Alternatively, bfloat16 also works, with differences in loading and inference time.
    model = FuyuForCausalLM.from_pretrained(model_id, device_map="mps", torch_dtype=torch.float16)

    # Load the last image path, or ask the user for one
    try:
        with open("last_path.txt", "r") as f:
            last_path = f.read().strip()
        user_input = input(f"Do you want to use the last path '{last_path}'? (yes/no, default yes): ")
        if user_input and user_input.lower() == 'no':
            raise ValueError("User chose to input a new path.")
    except (FileNotFoundError, ValueError):
        last_path = input("Please provide the image directory path: ")
        with open("last_path.txt", "w") as f:
            f.write(last_path)

    while True:
        # List the first 10 images in the directory
        images = list_files_in_directory(last_path)[:10]
        for idx, image in enumerate(images, start=1):
            print(f"{idx}. {image}")

        # Allow the user to select an image
        image_choice = input(f"Choose an image (1-{len(images)}) or enter its name: ")
        try:
            idx = int(image_choice)
            image_path = os.path.join(last_path, images[idx - 1])
        except ValueError:
            image_path = os.path.join(last_path, image_choice)
        try:
            image = Image.open(image_path)
        except OSError:
            print("Cannot open the image. Please check the path and try again.")
            continue

        questions = [
            "Generate a coco-style caption.",
            "What color is the object?",
            "Describe the scene.",
            "Describe the facial expression of the character.",
            "Tell me about the story from the image.",
            "Enter your own question",
        ]
        # Ask the user to select a question from the list, or to input their own
        for idx, q in enumerate(questions, start=1):
            print(f"{idx}. {q}")
        q_choice = int(input("Choose a question or enter your own: "))
        if q_choice < len(questions):
            text_prompt = questions[q_choice - 1] + '\n'
        else:
            text_prompt = input("Please enter your question: ") + '\n'

        while True:  # Let the user ask further questions about the same image
            inputs = processor(text=text_prompt, images=image, return_tensors="pt")
            for k, v in inputs.items():
                inputs[k] = v.to("mps")
            # Provide an explicit attention mask to eliminate the attention_mask warning
            inputs["attention_mask"] = torch.ones(inputs["input_ids"].shape, device="mps")
            generation_output = model.generate(**inputs, max_new_tokens=50, pad_token_id=model.config.eos_token_id)
            generation_text = processor.batch_decode(generation_output[:, -50:], skip_special_tokens=True)
            print("Answer:", generation_text[0])
            text_prompt = input("Ask another question about the same image or type '/exit' to exit: ") + '\n'
            if text_prompt.strip() == '/exit':
                break

if __name__ == "__main__":
    main()
Yes, it is Chinese, but not the Chinese fuyu-8b knows. It is not "食" (eating) but "我不想洗碗" ("I don't want to wash the dishes"). Fuyu-8b is lying, lol.
This tool allows you to redirect any TCP connection to a SOCKS or HTTPS proxy using your firewall, so the redirection can be system-wide or network-wide.
When is redsocks useful?
you want to route part of your TCP traffic via an OpenSSH DynamicForward SOCKS5 port using firewall policies (see the configuration sketch after this list). That was the original redsocks development goal;
you use a DVB ISP that provides internet connectivity through a special daemon, sometimes called an "Internet accelerator", which acts as a proxy but lacks the "transparent proxy" feature you need. Globax was an example of such an accelerator, though Globax 5 does have a transparent proxy feature. That was the second redsocks development goal;
you have to pass traffic through a proxy due to corporate network limitations. That was never a goal for redsocks, but users have reported success with some proxy configurations.
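To make the first scenario concrete, here is a minimal sketch with illustrative values (the addresses and ports are placeholders, not from the original text): redsocks listens on a local port and forwards everything it receives to the SOCKS5 proxy, for example one opened with ssh -D 1080.

// redsocks.conf (illustrative values)
redsocks {
    local_ip = 127.0.0.1;   // where redsocks itself listens
    local_port = 12345;
    ip = 127.0.0.1;         // the SOCKS5 proxy, e.g. opened with: ssh -D 1080 user@host
    port = 1080;
    type = socks5;
}

A firewall rule then redirects outbound TCP to that local port, for example:

iptables -t nat -A OUTPUT -p tcp -j REDIRECT --to-ports 12345

Note that a blanket rule like this would also redirect redsocks' own connection to the proxy and create a loop; a real setup excludes the proxy's address (and usually local traffic) from the rule.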
>>> from diffusers import UNet2DModel

>>> model = UNet2DModel(
...     sample_size=config.image_size,  # the target image resolution
...     in_channels=3,  # the number of input channels, 3 for RGB images
...     out_channels=3,  # the number of output channels
...     layers_per_block=2,  # how many ResNet layers to use per UNet block
...     block_out_channels=(128, 128, 256, 256, 512, 512),  # the number of output channels for each UNet block
...     down_block_types=(
...         "DownBlock2D",  # a regular ResNet downsampling block
...         "DownBlock2D",
...         "DownBlock2D",
...         "DownBlock2D",
...         "AttnDownBlock2D",  # a ResNet downsampling block with spatial self-attention
...         "DownBlock2D",
...     ),
...     up_block_types=(
...         "UpBlock2D",  # a regular ResNet upsampling block
...         "AttnUpBlock2D",  # a ResNet upsampling block with spatial self-attention
...         "UpBlock2D",
...         "UpBlock2D",
...         "UpBlock2D",
...         "UpBlock2D",
...     ),
... )
>>> from diffusers import DDPMPipeline
>>> from diffusers.utils import make_image_grid
>>> import math
>>> import os

>>> def evaluate(config, epoch, pipeline):
...     # Sample some images from random noise (this is the backward diffusion process).
...     # The default pipeline output type is `List[PIL.Image]`
...     images = pipeline(
...         batch_size=config.eval_batch_size,
...         generator=torch.manual_seed(config.seed),
...     ).images
...     # Make a grid out of the images
...     image_grid = make_image_grid(images, rows=4, cols=4)
...     # Save the images
...     test_dir = os.path.join(config.output_dir, "samples")
...     os.makedirs(test_dir, exist_ok=True)
...     image_grid.save(f"{test_dir}/{epoch:04d}.png")
1. Transactions with a nonce that is too low (lower than the account's current nonce) are rejected immediately.
2. Transactions with a nonce that is too high (higher than the current nonce) are placed in the transaction pool queue.
3. If transactions are sent that fill the gap between the last valid nonce and the too-high nonce, so that the nonce sequence becomes complete, all transactions in the sequence will be processed and mined.
4. The transaction pool queue only holds a maximum of 64 transactions with the same From: address with out-of-sequence nonces. In other words, for batch transfers, do not send more than 64 transactions from the same node.
5. When a geth instance is shut down and restarted, the transactions in the transaction pool queue disappear.
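To watch rules 1, 2, and 4 in practice, you can query the node for nonces and pool status. A small sketch using web3.py (a library choice of mine, not from the original text; the endpoint and address are placeholders):

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))  # placeholder local geth endpoint
addr = "0x0000000000000000000000000000000000000000"    # placeholder account address

# Nonce based on mined transactions only: the next valid nonce to use
print(w3.eth.get_transaction_count(addr))
# "pending" also counts executable transactions waiting in the pool
print(w3.eth.get_transaction_count(addr, "pending"))
# Geth-specific: how many transactions are pending vs queued (out of sequence)
print(w3.geth.txpool.status())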